Sunday, November 17, 2019

Learning Data Science - Day 2


import numpy as np
a=np.arange(1,11)
a
RUN
array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10])

a1=np.array([[1,2,3],[4,5,6]])
a1
RUN
array([[1, 2, 3], [4, 5, 6]])
To sum all the elements, run the command below:
a1.sum()
RUN
21
To sum the elements column-wise (down each column, axis=0):
a1.sum(axis=0)
RUN
array([5, 7, 9])
To sum row-wise (across each row, axis=1):
a1.sum(axis=1)
RUN
array([ 6, 15])
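One way to remember which axis is which: the axis you pass is the one that disappears from the shape. Here is a small sketch reusing the a1 array from above. The names col_sums and row_sums are only for illustration.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])   # shape (2, 3): 2 rows, 3 columns

col_sums = a1.sum(axis=0)   # collapses the 2 rows, leaving one total per column
row_sums = a1.sum(axis=1)   # collapses the 3 columns, leaving one total per row

print(a1.shape, col_sums.shape, row_sums.shape)
```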

b1=np.array([
    [10,20,30],
    [40,50,60],
    [70,80,90]
])
To multiply two arrays as matrices, use dot(), as in the example below:
a1.dot(b1)
RUN
array([[300, 360, 420], [660, 810, 960]])
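So how did we get the above output? Each row of a1 is multiplied element by element with each column of b1, and the products are added up.
a1 = [1, 2, 3]      b1 = [10, 20, 30]
     [4, 5, 6]           [40, 50, 60]
                         [70, 80, 90]
First row:  1x10 + 2x40 + 3x70 = 300    1x20 + 2x50 + 3x80 = 360    1x30 + 2x60 + 3x90 = 420
Second row: 4x10 + 5x40 + 6x70 = 660    4x20 + 5x50 + 6x80 = 810    4x30 + 5x60 + 6x90 = 960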
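To check the arithmetic, the same row-times-column rule can be written out with plain loops. This is only a sketch for understanding. The name manual is made up for this example, and in practice a1.dot(b1) does the same job much faster.

```python
import numpy as np

a1 = np.array([[1, 2, 3], [4, 5, 6]])
b1 = np.array([[10, 20, 30], [40, 50, 60], [70, 80, 90]])

rows, inner = a1.shape      # 2 rows, 3 values per row
cols = b1.shape[1]          # 3 columns in b1
manual = np.zeros((rows, cols), dtype=a1.dtype)

for i in range(rows):           # pick a row of a1
    for j in range(cols):       # pick a column of b1
        for k in range(inner):  # multiply matching elements and add them up
            manual[i, j] += a1[i, k] * b1[k, j]

print(manual)
print(np.array_equal(manual, a1.dot(b1)))
```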
To convert rows into columns and columns into rows (transpose):
a1.transpose()
RUN
array([[1, 4], [2, 5], [3, 6]])
c1=np.array([
    [1,5,3],
    [10,15,8],
    [7,4,11]
])
c1
RUN
array([[ 1, 5, 3], [10, 15, 8], [ 7, 4, 11]])

To sort each row (the default, along the last axis):
np.sort(c1)
RUN
array([[ 1,  3,  5],
       [ 8, 10, 15],
       [ 4,  7, 11]])
To sort each column (axis=0):
np.sort(c1,axis=0)
RUN
array([[ 1, 4, 3], [ 7, 5, 8], [10, 15, 11]])
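Note that np.sort gives back a new sorted array and leaves c1 as it was. A short sketch of the difference between that and the in-place sort() method follows. The names sorted_rows and c2 are just for this example.

```python
import numpy as np

c1 = np.array([[1, 5, 3], [10, 15, 8], [7, 4, 11]])

sorted_rows = np.sort(c1)   # returns a sorted copy, each row sorted on its own
print(c1[0])                # c1 itself is unchanged

c2 = c1.copy()              # work on a copy so c1 stays intact
c2.sort(axis=0)             # the array method sorts in place, here column by column
print(c2)
```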

To generate 5 random numbers between 0 and 1:
r1=np.random.rand(5)
r1
RUN
array([0.00128493, 0.76279554, 0.3523531 , 0.63897388, 0.69624613])
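The numbers will be different every time you run this. If you want the same numbers on every run, one option is a seeded generator. This is a minimal sketch, and the seed value 42 and the names rng_a, rng_b, first and second are arbitrary choices for illustration.

```python
import numpy as np

rng_a = np.random.default_rng(42)   # generator with a fixed seed
rng_b = np.random.default_rng(42)   # second generator with the same seed

first = rng_a.random(5)    # 5 floats in the range [0, 1)
second = rng_b.random(5)   # same seed, so the same 5 floats

print(first)
print(np.array_equal(first, second))
```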
Now we will look at Pandas.
import numpy as np
import pandas as pd
a=np.array([1,2,3,4,5])
s1=pd.Series(a)
s1
RUN
0    1
1    2
2    3
3    4
4    5
dtype: int32
In the above example no index is defined, so the default index (0, 1, 2, ...) is used.
In the example below we define the index ourselves.
index=np.array(['a','b','c','d','e'])
s2=pd.Series(a,index)
s2
RUN
a    1
b    2
c    3
d    4
e    5
dtype: int32
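Once a Series has its own index, values can be looked up by label. Here is a small sketch, with s2 rebuilt so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

a = np.array([1, 2, 3, 4, 5])
s2 = pd.Series(a, index=['a', 'b', 'c', 'd', 'e'])

one_value = s2['b']          # look up a single label
some_values = s2[['a', 'c']] # a list of labels gives back a smaller Series
first_value = s2.iloc[0]     # iloc still works by position

print(one_value, first_value)
print(some_values)
```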
s3=pd.Series([10,20,30,40],['d','e','f','g'])
s3
RUN
d    10
e    20
f    30
g    40
dtype: int64
s2+s3
RUN

a     NaN
b     NaN
c     NaN
d    14.0
e    25.0
f     NaN
g     NaN
dtype: float64
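The NaN values appear because only the labels d and e exist in both Series, so the other labels have nothing to be added to. If you would rather treat a missing label as 0, the add() method takes a fill_value. A sketch, with the Series rebuilt so it runs on its own (the name total is just for this example):

```python
import pandas as pd

s2 = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])
s3 = pd.Series([10, 20, 30, 40], index=['d', 'e', 'f', 'g'])

plain = s2 + s3                    # NaN wherever a label is missing on one side
total = s2.add(s3, fill_value=0)   # a missing label counts as 0 instead

print(plain)
print(total)
```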

Similarly, subtraction and multiplication also match values by index label:
s2-s3
s2*s3

Now we will look at DataFrames.
df1=pd.DataFrame(s2)
df1
RUN
   0
a  1
b  2
c  3
d  4
e  5
data=np.random.rand(4,4)
data
RUN
array([[0.34783327, 0.21628744, 0.72772869, 0.45423021], [0.07992632, 0.120551 , 0.81907381, 0.16060762], [0.17162294, 0.22556293, 0.0483888 , 0.22871412], [0.49538935, 0.41450558, 0.1659517 , 0.84325706]])
index=np.array(['a','b','c','d'])
index
RUN
array(['a', 'b', 'c', 'd'], dtype='<U1')
df2=pd.DataFrame(data,index)
df2
RUN
          0         1         2         3
a  0.347833  0.216287  0.727729  0.454230
b  0.079926  0.120551  0.819074  0.160608
c  0.171623  0.225563  0.048389  0.228714
d  0.495389  0.414506  0.165952  0.843257

To name the columns as well, pass the column labels as the third argument:
col=np.array(['Id','Name','City','Phone'])
col
RUN
array(['Id', 'Name', 'City', 'Phone'], dtype='<U5')
df3=pd.DataFrame(data,index,col)
df3
RUN
         Id      Name      City     Phone
a  0.347833  0.216287  0.727729  0.454230
b  0.079926  0.120551  0.819074  0.160608
c  0.171623  0.225563  0.048389  0.228714
d  0.495389  0.414506  0.165952  0.843257
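With both row and column labels in place, parts of the DataFrame can be picked out by name. A short sketch follows. The DataFrame is rebuilt here with fresh random data, so the numbers will differ from the ones above, and the names city, row_b and value are only for illustration.

```python
import numpy as np
import pandas as pd

data = np.random.rand(4, 4)
df3 = pd.DataFrame(data,
                   index=['a', 'b', 'c', 'd'],
                   columns=['Id', 'Name', 'City', 'Phone'])

city = df3['City']            # one column, returned as a Series
row_b = df3.loc['b']          # one row by its label, also a Series
value = df3.loc['b', 'City']  # a single cell: row label, then column label

print(city)
print(row_b)
print(value)
```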
