How can I filter one matrix based on another?

Owen 11/08/2018. 7 answers, 600 views
text-processing

File1:

91  23  56  44  87  77
99  34  56  22  22  95
41  88  26  79  60  27
95  55  66  69  92  25

File2:

pass fail pass pass pass fail
pass fail pass fail fail pass
pass pass fail pass pass fail
pass pass fail pass pass fail

I want to sum the marks flagged as "fail" in each row. For example, in the first row the fail columns are 2 and 6, so the total is 23 + 77 = 100. Here is the expected output.

output:

100
78
53
91

How can I filter file1 based on the word "fail" in file2 to get the sum of the fail marks for each row?

7 Answers


RudiC 11/08/2018.

I don't think you need an END section:

awk '
NR == FNR       {for (i=1; i<=NF; i++) F[i,NR] = $i
                 next
                }
                {T = 0
                 for (i=1; i<=NF; i++) T += ($i=="fail")?F[i,FNR]:0
                 print T
                }
' file[12]
100
78
53
91

Thor 11/08/2018.

I would use a matrix language for such a task, e.g. GNU Octave.

Assuming you converted the pass/fail file into numerical values, e.g.:

sed 's/pass/1/g; s/fail/0/g' passfail > passfail.nums

You can now do the following:

marks    = dlmread('marks');
passfail = dlmread('passfail.nums');

for i = 1:size(marks)(1)
  sum(marks(i,:)(passfail(i,:) == 0))
end

Output:

ans =  100
ans =  78
ans =  53
ans =  91

Maxim 11/08/2018.

While I think awk is good for portability, other languages seem easier to write and read for this task. GNU Octave was mentioned, but it does not come preinstalled on most machines. Most systems, on the other hand, have some version of Python preinstalled. Here is a Python version:

for marks, decisions in zip(open('file1').readlines(), open('file2').readlines()):
    row_score = 0
    for mark, decision in zip(marks.split(), decisions.split()):
        if decision == 'fail':
            row_score += int(mark)
    print(row_score)

which prints the output you expected.


jimmij 11/08/2018.

Here is my awk approach:

awk 'NR==FNR{for(i=1;i<=NF;i++) a[NR"-"i]=$i; next} \
            {for(j=1;j<=NF;j++) if($j=="fail") b[FNR]+=a[FNR"-"j]; n=FNR} \
         END{for(k=1;k<=n;k++) print b[k]+0}' file1 file2

Awk has no true two-dimensional arrays, so we emulate one by combining the two numbers (row and field) into a single array index. The output is:

100
78
53
91
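
For readers more at home in Python, the composite-index idea can be sketched roughly as below. This is only an illustration, not part of the awk answer: the function name fail_sums and the inline sample rows are assumptions, and a tuple (row, column) stands in for the "row-field" string key.

```python
# Sketch: emulate a 2-D table with composite (row, column) keys,
# similar in spirit to the a[NR"-"i] index used in the awk answer.
def fail_sums(mark_lines, flag_lines):
    marks = {}
    # First pass: store every mark under its (row, column) key.
    for row, line in enumerate(mark_lines, start=1):
        for col, value in enumerate(line.split(), start=1):
            marks[(row, col)] = int(value)
    # Second pass: add up the marks whose flag is "fail".
    totals = []
    for row, line in enumerate(flag_lines, start=1):
        total = 0
        for col, flag in enumerate(line.split(), start=1):
            if flag == "fail":
                total += marks[(row, col)]
        totals.append(total)
    return totals

# First two rows of file1 and file2 from the question.
file1 = ["91 23 56 44 87 77", "99 34 56 22 22 95"]
file2 = ["pass fail pass pass pass fail", "pass fail pass fail fail pass"]
print(fail_sums(file1, file2))
```

Like the awk version, this keeps the whole marks table in memory before the flags are read.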

mosvy 11/08/2018.
awk '
  BEGIN{ pf=ARGV[2]; ARGV[2]="" }
  { getline l <pf; split(l, a); n=0;
    for(i=1;i<=NF;i++) if(a[i]=="fail") n+=$i;
    print n }
' file1 file2
100
78
53
91

Just like @Maxim's Python version, but unlike all the other answers, this processes the two files in parallel, line by line, instead of loading one of them whole into memory.
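
As a rough sketch of the same line-by-line idea in Python: iterating over file objects with zip() pulls one line from each input per step, so neither file has to be held in memory. The name fail_sums is hypothetical, and io.StringIO objects stand in for open('file1') and open('file2') so that the snippet runs on its own.

```python
import io

def fail_sums(marks_lines, flags_lines):
    """Yield one fail total per row, reading both inputs in lockstep."""
    for marks, flags in zip(marks_lines, flags_lines):
        yield sum(
            int(mark)
            for mark, flag in zip(marks.split(), flags.split())
            if flag == "fail"
        )

# Stand-ins for the real files (first two rows from the question).
marks_file = io.StringIO("91 23 56 44 87 77\n99 34 56 22 22 95\n")
flags_file = io.StringIO(
    "pass fail pass pass pass fail\npass fail pass fail fail pass\n"
)
for total in fail_sums(marks_file, flags_file):
    print(total)
```

With real files, passing the two open file objects directly should behave the same way, since file objects iterate line by line.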


Inian 11/08/2018.

I think an Awk script makes this fairly easy to solve. Something like the below should work, though I suspect it is a bit slower than jimmij's answer, which has since been posted.

#!/usr/bin/awk -f


FNR == NR {
    # First file (file2): remember which columns are marked "fail" in each row
    for (i = 1; i <= NF; i++)
        if ($i == "fail")
            idxArray[FNR] = (idxArray[FNR]) ? (idxArray[FNR] " " i) : (i)
    next
}
{
    # Second file (file1): add up the marks in the remembered columns
    delete Array
    sum = 0
    n = split(idxArray[FNR], Array, " ")
    for (i = 1; i <= n; i++)
        sum += $(Array[i])
    print sum
}

and run the script as

awk -f script.awk file2 file1

RudiC 11/09/2018.

One-liner:

paste file[12] | awk '{T=0; for (i=1; i<=NF/2; i++) T += ($(i+NF/2)=="fail")?$i:0; print T}'
100
78
53
91
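
To make the field arithmetic in the one-liner easier to follow, here is a small Python sketch of what happens to a single pasted line. It assumes, as the awk does, that each joined line holds the marks in its first half and the pass/fail flags in its second half; the function name is made up for illustration.

```python
def fail_sum_from_pasted(line):
    """Sum the marks in the first half whose flag in the second half is 'fail'."""
    fields = line.split()
    half = len(fields) // 2          # same role as NF/2 in the awk one-liner
    marks, flags = fields[:half], fields[half:]
    return sum(int(mark) for mark, flag in zip(marks, flags) if flag == "fail")

# First row of file1 and file2 joined the way paste would join them.
pasted = "91 23 56 44 87 77 pass fail pass pass pass fail"
print(fail_sum_from_pasted(pasted))
```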
