Knowledge Hub: Learning Datascience -Day 2

Sunday, November 17, 2019

import numpy as np

a=np.arange(1,11)

RUN

array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

a1=np.array([[1,2,3],[4,5,6]])

RUN

array([[1, 2, 3], [4, 5, 6]])

To sum all the elements of the array, run the command below

a1.sum()

RUN

21

To sum the elements column-wise (down each column)

a1.sum(axis=0)

RUN

array([5, 7, 9])

To sum the elements row-wise (across each row)

a1.sum(axis=1)

RUN

array([ 6, 15])
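
The axis argument is easy to mix up, so here is a small sketch that recomputes the sums above and prints the shapes. The variable names total, col_sums and row_sums are only for illustration. axis=0 collapses the rows and leaves one value per column; axis=1 collapses the columns and leaves one value per row.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])

total = a1.sum()            # sum of every element
col_sums = a1.sum(axis=0)   # collapse the rows: one value per column
row_sums = a1.sum(axis=1)   # collapse the columns: one value per row

# a1 has shape (2, 3); summing along an axis removes that axis
print(a1.shape, col_sums.shape, row_sums.shape)

# Both partial sums add up to the same grand total
print(total, col_sums.sum(), row_sums.sum())
```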

b1=np.array([

[10,20,30],

[40,50,60],

[70,80,90]

])

For matrix multiplication of two arrays (the dot product), see the example below

a1.dot(b1)

RUN

array([[300, 360, 420], [660, 810, 960]])

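So how did we get the above output? Each value in the result is one row of a1 multiplied element by element with one column of b1, and then added up.

a1 has the rows [1, 2, 3] and [4, 5, 6]

b1 has the rows [10,20,30], [40,50,60] and [70,80,90]

So the first row of the result is

1x10 + 2x40 + 3x70 = 300

1x20 + 2x50 + 3x80 = 360

1x30 + 2x60 + 3x90 = 420

and the second row is

4x10 + 5x40 + 6x70 = 660

4x20 + 5x50 + 6x80 = 810

4x30 + 5x60 + 6x90 = 960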
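
The same row-by-column arithmetic can be written out with plain loops. This is only a sketch to show what dot() does; the name manual is made up for the example, and in practice a1.dot(b1) is the one to use.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])
b1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

# The result has one row per row of a1 and one column per column of b1
manual = np.zeros((a1.shape[0], b1.shape[1]), dtype=int)

for i in range(a1.shape[0]):          # each row of a1
    for j in range(b1.shape[1]):      # each column of b1
        for k in range(a1.shape[1]):  # walk along the row and down the column
            manual[i, j] += a1[i, k] * b1[k, j]

print(manual)
print(np.array_equal(manual, a1.dot(b1)))
```

Note that the number of columns in a1 (3) has to match the number of rows in b1 (3), otherwise the multiplication is not defined.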

To convert rows into columns and columns into rows (transpose)

a1.transpose()

RUN

array([[1, 4], [2, 5], [3, 6]])
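
A short sketch to confirm what the transpose does to the shape. It also uses the .T attribute, which is a shorthand NumPy provides for the same operation; the name t is just for this example.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])

t = a1.transpose()

# The shape flips from (2, 3) to (3, 2)
print(a1.shape, t.shape)

# .T gives the same result as .transpose()
print(np.array_equal(t, a1.T))

# Element [i, j] of the original ends up at [j, i]
print(a1[0, 2], t[2, 0])
```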

c1=np.array([

[1,5,3],

[10,15,8],

[7,4,11]

])

RUN

array([[ 1, 5, 3], [10, 15, 8], [ 7, 4, 11]])

To sort row-wise (the values within each row)

np.sort(c1)

RUN

array([[ 1, 3, 5],

[ 8, 10, 15],

[ 4, 7, 11]])

To sort column-wise (the values within each column)

np.sort(c1,axis=0)

RUN

array([[ 1, 4, 3], [ 7, 5, 8], [10, 15, 11]])
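
One thing worth checking is that np.sort() returns a sorted copy and leaves c1 as it was. The sketch below shows this, and also the in-place sort() method on the array itself. The names by_row, by_col and c2 are only for illustration.

```python
import numpy as np

c1 = np.array([[1, 5, 3], [10, 15, 8], [7, 4, 11]])

by_row = np.sort(c1)           # default is the last axis: sorts within each row
by_col = np.sort(c1, axis=0)   # sorts within each column

# c1 itself is unchanged, np.sort() worked on a copy
print(c1)

# The array method sorts in place, so work on a copy to keep c1 intact
c2 = c1.copy()
c2.sort(axis=1)
print(np.array_equal(c2, by_row))
```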

r1=np.random.rand(5)

RUN

array([0.00128493, 0.76279554, 0.3523531 , 0.63897388, 0.69624613])
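
np.random.rand(5) gives five numbers between 0 and 1, and they will be different on every run. If you want the same numbers each time, one option (not used in the original notes) is a seeded generator from np.random.default_rng, available in NumPy 1.17 and later. The seed 42 below is an arbitrary choice.

```python
import numpy as np

# Two generators built from the same seed produce the same sequence
rng_a = np.random.default_rng(42)
rng_b = np.random.default_rng(42)

r_a = rng_a.random(5)   # five floats in the range [0, 1)
r_b = rng_b.random(5)

print(r_a)
print(np.array_equal(r_a, r_b))
```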

Now we will move on to pandas

import numpy as np

import pandas as pd

a=np.array([1,2,3,4,5])

s1=pd.Series(a)

RUN

0 1

1 2

2 3

3 4

4 5

dtype: int32

In the above example no index is defined, so the default index (0, 1, 2, ...) is used.

In the example below we will define the index ourselves.

index=np.array(['a','b','c','d','e'])

s2=pd.Series(a,index)

RUN

a 1

b 2

c 3

d 4

e 5

dtype: int32
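
To see the difference between the default index and our own labels, here is a small sketch that prints both indexes and looks up the same value in different ways. It passes the index by keyword (index=...), which does the same as the positional form above but is easier to read.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3, 4, 5])

s1 = pd.Series(a)                                       # default index 0..4
s2 = pd.Series(a, index=['a', 'b', 'c', 'd', 'e'])      # our own labels

print(s1.index)
print(s2.index)

print(s1[2])        # lookup by the default label 2
print(s2['c'])      # lookup by our label 'c'
print(s2.iloc[2])   # lookup by position, whatever the labels are
```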

s3=pd.Series([10,20,30,40],['d','e','f','g'])

RUN

d 10

e 20

f 30

g 40

dtype: int64

s2+s3

RUN

a NaN

b NaN

c NaN

d 14.0

e 25.0

f NaN

g NaN

dtype: float64
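
Why the NaN values? pandas lines up the two Series by index label before adding. Only 'd' and 'e' exist in both, so every other label has nothing to add to and becomes NaN (and the result turns into float64, since NaN is a float). If you would rather treat a missing label as 0, the add() method takes a fill_value argument. A small sketch, with the names total and filled made up for the example:

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
s3 = pd.Series([10, 20, 30, 40], index=['d', 'e', 'f', 'g'])

total = s2 + s3                      # NaN wherever a label is missing on one side
filled = s2.add(s3, fill_value=0)    # a missing label counts as 0 instead

print(total)
print(filled)
```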

Similarly, subtraction and multiplication also line up on the index

s2-s3

s2*s3
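
Subtraction and multiplication follow the same label matching as addition. The sketch below drops the NaN rows with dropna() so that only the labels present in both Series are printed; diff and prod are illustrative names.

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
s3 = pd.Series([10, 20, 30, 40], index=['d', 'e', 'f', 'g'])

diff = s2 - s3
prod = s2 * s3

# Only 'd' and 'e' are in both Series, everything else is NaN
print(diff.dropna())
print(prod.dropna())
```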

Now we will work with DataFrames

df1=pd.DataFrame(s2)

df1

RUN

	0
a	1
b	2
c	3
d	4
e	5
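
Notice that the single column is labelled 0, because the Series had no name. If you want a proper column name, one way is the to_frame() method of the Series. The column name 'value' is just a placeholder for this sketch.

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

df1 = pd.DataFrame(s2)                # column gets the default label 0
named = s2.to_frame(name='value')     # same data, column called 'value'

print(list(df1.columns))
print(list(named.columns))
print(named)
```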

data=np.random.rand(4,4)

data

RUN

array([[0.34783327, 0.21628744, 0.72772869, 0.45423021], [0.07992632, 0.120551 , 0.81907381, 0.16060762], [0.17162294, 0.22556293, 0.0483888 , 0.22871412], [0.49538935, 0.41450558, 0.1659517 , 0.84325706]])

index=np.array(['a','b','c','d'])

index

RUN

array(['a', 'b', 'c', 'd'], dtype='<U1')

df2=pd.DataFrame(data,index)

df2

RUN

	0	1	2	3
a	0.347833	0.216287	0.727729	0.454230
b	0.079926	0.120551	0.819074	0.160608
c	0.171623	0.225563	0.048389	0.228714
d	0.495389	0.414506	0.165952	0.843257

To give names to the columns

col=np.array(['Id','Name','City','Phone'])

col

RUN

array(['Id', 'Name', 'City', 'Phone'], dtype='<U5')

df3=pd.DataFrame(data,index,col)

df3

RUN

	Id	Name	City	Phone
a	0.347833	0.216287	0.727729	0.454230
b	0.079926	0.120551	0.819074	0.160608
c	0.171623	0.225563	0.048389	0.228714
d	0.495389	0.414506	0.165952	0.843257
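
Once the rows and columns have labels, you can use them to pick out data. The sketch below builds the same kind of DataFrame with the index= and columns= keywords (equivalent to the positional form above) and then selects a column, a row and a single cell. It uses a seeded generator so the numbers repeat between runs; the seed 0 is arbitrary and the values will not match the output shown above.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)   # arbitrary seed, only for repeatable output
data = rng.random((4, 4))

df3 = pd.DataFrame(
    data,
    index=['a', 'b', 'c', 'd'],
    columns=['Id', 'Name', 'City', 'Phone'],
)

print(df3['Name'])            # one column, returned as a Series
print(df3.loc['a'])           # one row by its index label
print(df3.loc['b', 'City'])   # one cell by row label and column name
```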
