Finden der verschiedenen möglichen Kombinationen

Question 1

Verwendung von awk:

function wprint() {
    print w[1], w[2], w[3];
}

function wshift(e) {
    w[1] = w[2]; w[2] = w[3]; w[3] = e;
}

BEGIN { FS = OFS = "," }

{
    wshift($1);
    wshift($2);
    wshift($3);
    wprint();

    for (i = 4; i <= NF; ++i) {
        wshift($i);
        wprint();
    }
}

Dann:

$ awk -f script data.in
A,B,C
B,C,D
C,D,E
P,Q,R
G,D,V
D,V,K
L,Q,X
Q,X,I
X,I,U
I,U,G

Das awkSkript verwendet ein verschiebbares Fenster mit drei Elementen w. Für jede Eingabezeile füllt es die drei Elemente des Fensters mit den ersten drei Feldern und druckt diese als kommagetrennte Liste (gefolgt von einem Zeilenumbruch). Anschließend iteriert es über die verbleibenden Felder in der Zeile, verschiebt sie in das Fenster und druckt das Fenster für jedes Element.

Wenn eine Zeile in den Eingabedaten weniger als zwei Felder enthält, erhalten Sie Folgendes:

A,,

oder

A,B,

in der Ausgabe.

Wenn Sie sicher sind, dass jede Eingabezeile mindestens drei Felder hat (oder wenn Sie alle Zeilen ignorieren möchten, bei denen das nicht der Fall ist), können Sie das awkSkript etwas kürzen:

function wprint() {
    print w[1], w[2], w[3];
}

function wshift(e) {
    w[1] = w[2]; w[2] = w[3]; w[3] = e;
}

BEGIN { FS = OFS = "," }

{
    for (i = 1; i <= NF; ++i) {
        wshift($i);
        if (i >= 3) {
            wprint();
        }
    }
}

Eine Verallgemeinerung der ersten Variante des Skripts mit variabler Fenstergröße:

function wprint(i) {
    for (i = 1; i < n; ++i) {
        printf("%s%s", w[i], OFS);
    }
    print w[n]
}

function wshift(e,i) {
    for (i = 1; i < n; ++i) {
        w[i] = w[i + 1];
    }
    w[n] = e;
}

BEGIN { FS = OFS = "," }

{
    for (i = 1; i <= n; ++i) {
        wshift($i);
    }
    wprint();

    for (i = n + 1; i <= NF; ++i) {
        wshift($i);
        wprint();
    }
}

Es benutzen:

$ awk -v n=4 -f script data.in
A,B,C,D
B,C,D,E
P,Q,R,
G,D,V,K
L,Q,X,I
Q,X,I,U
X,I,U,G

Answer

Verwendung von awk:

function wprint() {
    print w[1], w[2], w[3];
}

function wshift(e) {
    w[1] = w[2]; w[2] = w[3]; w[3] = e;
}

BEGIN { FS = OFS = "," }

{
    wshift($1);
    wshift($2);
    wshift($3);
    wprint();

    for (i = 4; i <= NF; ++i) {
        wshift($i);
        wprint();
    }
}

Dann:

$ awk -f script data.in
A,B,C
B,C,D
C,D,E
P,Q,R
G,D,V
D,V,K
L,Q,X
Q,X,I
X,I,U
I,U,G

Das awkSkript verwendet ein verschiebbares Fenster mit drei Elementen w. Für jede Eingabezeile füllt es die drei Elemente des Fensters mit den ersten drei Feldern und druckt diese als kommagetrennte Liste (gefolgt von einem Zeilenumbruch). Anschließend iteriert es über die verbleibenden Felder in der Zeile, verschiebt sie in das Fenster und druckt das Fenster für jedes Element.

Wenn eine Zeile in den Eingabedaten weniger als zwei Felder enthält, erhalten Sie Folgendes:

A,,

oder

A,B,

in der Ausgabe.

Wenn Sie sicher sind, dass jede Eingabezeile mindestens drei Felder hat (oder wenn Sie alle Zeilen ignorieren möchten, bei denen das nicht der Fall ist), können Sie das awkSkript etwas kürzen:

function wprint() {
    print w[1], w[2], w[3];
}

function wshift(e) {
    w[1] = w[2]; w[2] = w[3]; w[3] = e;
}

BEGIN { FS = OFS = "," }

{
    for (i = 1; i <= NF; ++i) {
        wshift($i);
        if (i >= 3) {
            wprint();
        }
    }
}

Eine Verallgemeinerung der ersten Variante des Skripts mit variabler Fenstergröße:

function wprint(i) {
    for (i = 1; i < n; ++i) {
        printf("%s%s", w[i], OFS);
    }
    print w[n]
}

function wshift(e,i) {
    for (i = 1; i < n; ++i) {
        w[i] = w[i + 1];
    }
    w[n] = e;
}

BEGIN { FS = OFS = "," }

{
    for (i = 1; i <= n; ++i) {
        wshift($i);
    }
    wprint();

    for (i = n + 1; i <= NF; ++i) {
        wshift($i);
        wprint();
    }
}

Es benutzen:

$ awk -v n=4 -f script data.in
A,B,C,D
B,C,D,E
P,Q,R,
G,D,V,K
L,Q,X,I
Q,X,I,U
X,I,U,G

Question 2

Mit perl:

perl -F, -le 'BEGIN { $, = "," } while(@F >= 3) { print @F[0..2]; shift @F }' file

Mit awk:

awk -F, -v OFS=, 'NF>=3 { for(i=1; i<=NF-2; i++) print $i, $(i+1), $(i+2) }' file

Answer

Mit perl:

perl -F, -le 'BEGIN { $, = "," } while(@F >= 3) { print @F[0..2]; shift @F }' file

Mit awk:

awk -F, -v OFS=, 'NF>=3 { for(i=1; i<=NF-2; i++) print $i, $(i+1), $(i+2) }' file

Question 3

Mit Perl können wir es folgendermaßen angehen:

perl -lne '/(?:([^,]+)(?=((?:,[^,]+){2}))(?{ print $1,$2 }))*$/' yourfile
perl -F, -lne '$,=","; print shift @F, @F[0..1] while @F >= 3' 
perl -F, -lne '$,=","; print splice @F, 0, 3, @F[1,2] while @F >= 3'

die sich wie folgt erweitert ausdrücken lässt:

perl -lne '
   m/
      (?:                       # set up a do-while loop
         ([^,]+)                # first field which shall be deleted after printing
         (?=((?:,[^,]+){2}))    # lookahead and remember the next 2 fields
         (?{ print $1,$2 })     # print the first field + next 2 fields
      )*                        # loop back for more
      $                         # till we hit the end of line
   /x;
' yourfile

Und mit sed können wir dies mit einer Auswahl seiner Befehle tun:

sed -e '
   /,$/!s/$/,/     # add a dummy comma at the EOL

   s/,/\n&/3;ta    # while there still are 3 elements in the line jump to label "a"
   d               # else quit processing this line any further

   :a              # main action
   P               # print the leading portion, i.e., that which is left of the first newline in the pattern space
   s/\n//          # take away the marker

   s/,/\n/;tb      # get ready to delete the first field
   :b

   D               # delete the first field, and apply the sed code all over from the beginning to what remains in the pattern space
' yourfile

Dc kann auch Folgendes:

sed -e 's/[^,]*/[&]/g;y/,/ /' gene_data.in |
dc -e '
[q]sq                            # macro for quitting
[SM z0<a]sa                      # macro to store stack -> register "M"
[LMd SS zlk>b c]sb               # macro to put register "M" -> register "S"
[LS zlk>c]sc                     # macro to put register "S" -> stack
[n44an dn44an rdn10anr z3!>d]sd  # macro to print 1st three stack elements
[zsk lax lbx lcx ldx c]se        # macro that initializes & calls all other macros
[?z3>q lex z0=?]s?               # while loop to read in file line by line and run macro "e" on each line
l?x                              # main()
'

Ergebnisse

A,B,C
B,C,D
C,D,E
D,E,F
E,F,G
P,Q,R
G,D,V
D,V,K
L,Q,X
Q,X,I
X,I,U
I,U,G

Answer

Mit Perl können wir es folgendermaßen angehen:

perl -lne '/(?:([^,]+)(?=((?:,[^,]+){2}))(?{ print $1,$2 }))*$/' yourfile
perl -F, -lne '$,=","; print shift @F, @F[0..1] while @F >= 3' 
perl -F, -lne '$,=","; print splice @F, 0, 3, @F[1,2] while @F >= 3'

die sich wie folgt erweitert ausdrücken lässt:

perl -lne '
   m/
      (?:                       # set up a do-while loop
         ([^,]+)                # first field which shall be deleted after printing
         (?=((?:,[^,]+){2}))    # lookahead and remember the next 2 fields
         (?{ print $1,$2 })     # print the first field + next 2 fields
      )*                        # loop back for more
      $                         # till we hit the end of line
   /x;
' yourfile

Und mit sed können wir dies mit einer Auswahl seiner Befehle tun:

sed -e '
   /,$/!s/$/,/     # add a dummy comma at the EOL

   s/,/\n&/3;ta    # while there still are 3 elements in the line jump to label "a"
   d               # else quit processing this line any further

   :a              # main action
   P               # print the leading portion, i.e., that which is left of the first newline in the pattern space
   s/\n//          # take away the marker

   s/,/\n/;tb      # get ready to delete the first field
   :b

   D               # delete the first field, and apply the sed code all over from the beginning to what remains in the pattern space
' yourfile

Dc kann auch Folgendes:

sed -e 's/[^,]*/[&]/g;y/,/ /' gene_data.in |
dc -e '
[q]sq                            # macro for quitting
[SM z0<a]sa                      # macro to store stack -> register "M"
[LMd SS zlk>b c]sb               # macro to put register "M" -> register "S"
[LS zlk>c]sc                     # macro to put register "S" -> stack
[n44an dn44an rdn10anr z3!>d]sd  # macro to print 1st three stack elements
[zsk lax lbx lcx ldx c]se        # macro that initializes & calls all other macros
[?z3>q lex z0=?]s?               # while loop to read in file line by line and run macro "e" on each line
l?x                              # main()
'

Ergebnisse

A,B,C
B,C,D
C,D,E
D,E,F
E,F,G
P,Q,R
G,D,V
D,V,K
L,Q,X
Q,X,I
X,I,U
I,U,G

Finden der verschiedenen möglichen Kombinationen

Antwort1

Antwort2

Antwort3

Ergebnisse

verwandte Informationen