We read the file using Pandas.
import pandas as pd
import numpy as np

rawData = pd.read_csv('data-Assignment2.txt', sep=",", header=None)
To compute the signature matrix, we first need a random permutation of the rows of the whole matrix. With pandas, sampling all of the rows (frac=1) without replacement gives exactly that:
permuteData = rawData.sample(frac=1)
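To make the link between the row permutation and the signature matrix concrete, here is a minimal sketch of how one signature row could be derived from one shuffle. It assumes the data is a 0/1 matrix with one column per set. The helper name `minhash_row`, the toy matrix, the use of `random_state` as a seed, and the `-1` placeholder for columns with no 1 are all illustrative choices, not part of the assignment.

```python
import numpy as np
import pandas as pd

def minhash_row(matrix: pd.DataFrame, seed: int) -> np.ndarray:
    """Return one signature row for a 0/1 matrix (hypothetical helper).

    For each column, this is the position of the first 1 after the
    rows have been randomly permuted.
    """
    # Shuffle the rows; random_state makes the permutation reproducible.
    permuted = matrix.sample(frac=1, random_state=seed).reset_index(drop=True)
    values = permuted.to_numpy()
    # argmax returns the first position of the maximum, i.e. the first 1.
    first_one = values.argmax(axis=0)
    # Columns with no 1 at all get -1 (an assumption made for this sketch).
    has_one = values.any(axis=0)
    return np.where(has_one, first_one, -1)

# Toy characteristic matrix: 5 rows (elements), 3 columns (sets).
toy = pd.DataFrame({
    "S1": [1, 0, 0, 1, 0],
    "S2": [0, 0, 1, 1, 0],
    "S3": [0, 1, 0, 1, 1],
})

# Three permutations give a signature matrix with three rows.
signature = np.vstack([minhash_row(toy, seed) for seed in range(3)])
print(signature)
```

Each call uses a different seed, so each signature row comes from an independent permutation. More permutations give a longer signature for every column.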
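As a side note, a frac below 1 gives a random subsample instead of a full shuffle. Also, sample returns a new DataFrame rather than modifying the original, so to shuffle "in place" we assign the result back to the same name and reset the index:
df = df.sample(frac=1).reset_index(drop=True)  # reassign the shuffled frame; drop=True discards the old index instead of keeping it as a column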
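The difference between subsampling, shuffling, and resetting the index is easier to see on a small frame. The following is a sketch on made-up data; the column names and the `random_state` value are assumptions chosen only to make the run reproducible.

```python
import pandas as pd

# Small illustrative frame with 10 rows.
df = pd.DataFrame({"A": range(10), "B": range(10, 20)})

# frac < 1 draws a random subsample: here half of the rows.
subsample = df.sample(frac=0.5, random_state=0)

# frac = 1 returns every row in random order; the old row labels travel
# with the rows, so the index is shuffled too.
shuffled_keep_index = df.sample(frac=1, random_state=0)

# reset_index(drop=True) replaces the shuffled labels with 0..n-1.
shuffled = shuffled_keep_index.reset_index(drop=True)

print(len(subsample))
print(shuffled_keep_index.index.tolist())
print(shuffled.index.tolist())
```

Note that `df` itself is unchanged by any of these calls, which is why the result has to be assigned back when a shuffled `df` is wanted.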
We can test that this works on a random 0/1 matrix generated with NumPy and wrapped in a pandas DataFrame.
# create a random matrix with 0 and 1, like our example matrix
df = pd.DataFrame(np.random.randint(0, 2, size=(100, 4)), columns=list('ABCD'))

# now we can do a shuffle like this
df = df.sample(frac=1)
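Beyond inspecting the result by eye, the shuffle can be checked programmatically: a valid permutation keeps every row exactly once and leaves the row contents untouched. The sketch below assumes a seeded generator (`np.random.default_rng(0)`) and a `random_state` so the check is reproducible; the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Random 0/1 matrix, seeded so the check can be repeated.
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.integers(0, 2, size=(100, 4)), columns=list("ABCD"))

shuffled = df.sample(frac=1, random_state=0)

# Every original row label appears exactly once after the shuffle.
same_labels = sorted(shuffled.index.tolist()) == df.index.tolist()

# Putting the rows back in their original order recovers the same values.
same_content = bool((shuffled.sort_index().to_numpy() == df.to_numpy()).all())

# Column sums do not depend on row order, so they must match as well.
same_sums = bool((shuffled.sum() == df.sum()).all())

print(same_labels, same_content, same_sums)
```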
The matrix before and after the shuffle is shown in the following figure:
a = []
b = []
for k in range(3):
    for j in range(4):
        a.append(j)
    b.append(a)
    a = []  # start a fresh inner list for the next row
print(b)
# OUTPUT: [[0, 1, 2, 3], [0, 1, 2, 3], [0, 1, 2, 3]]