A Step by Step Guide for Data Scientists


In this step-by-step guide for data scientists, I am going to share the baby steps for becoming a successful data scientist or machine learning developer.

Step 1: Basics of Maths

Before creation, God did just pure mathematics. Then he thought it would be a pleasant change to do some applied.

-John Edensor Littlewood

Data science is a game of numbers; you should be familiar with the following maths concepts:

  1.  Linear Algebra
  2.  Calculus
  3.  Statistics
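To make these three topics a little more concrete, here is a minimal sketch with one tiny example of each. It assumes Python with NumPy (both are introduced in the later steps); the numbers and the function `f` are made up purely for illustration.

```python
import numpy as np

# Linear algebra: solve the linear system A x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Calculus: approximate the derivative of f(t) = t**2 at t = 3
# with a central difference (the exact answer is 2 * 3 = 6)
def f(t):
    return t ** 2

h = 1e-5
derivative = (f(3.0 + h) - f(3.0 - h)) / (2 * h)

# Statistics: mean and sample standard deviation of a small sample
samples = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean = samples.mean()
std = samples.std(ddof=1)  # ddof=1 gives the sample (not population) estimate

print("solution x:", x)
print("approximate f'(3):", derivative)
print("mean:", mean, "sample std:", std)
```

Most machine learning models combine these same ingredients: matrices hold the data, derivatives drive the training, and statistics describe the results.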

Links: https://www.khanacademy.org

https://ocw.mit.edu/courses/mathematics/18-06-linear-algebra-spring-2010/

 

Step 2: Choose a programming language you are comfortable with

You can choose between Python, R, Java, and C/C++.

Links: https://learnxinyminutes.com/

 

Step 3: Learn basic libraries

  1.  NumPy
  2.  Pandas
  3.  Matplotlib, etc.
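As a quick taste of what each library is for, here is a small sketch that uses all three together. It assumes Python as the chosen language; the `hours` and `score` values are invented for illustration only.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the script also runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# NumPy: fast element-wise math on arrays, no explicit loop needed
hours = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
scores = 10.0 * hours + 40.0

# Pandas: labelled, tabular data with built-in summaries
df = pd.DataFrame({"hours": hours, "score": scores})
summary = df.describe()
print(summary)

# Matplotlib: a simple line plot of one column against another
fig, ax = plt.subplots()
ax.plot(df["hours"], df["score"], marker="o")
ax.set_xlabel("hours")
ax.set_ylabel("score")
```

In an interactive session you would drop the `matplotlib.use("Agg")` line and call `plt.show()` to see the figure, or use `fig.savefig(...)` to write it to a file.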

Links: http://www.numpy.org/

https://pandas.pydata.org/

 

Step 4: Data Pre-processing

Data! Data! Data!

Data mostly comes in two types: structured (tabular) and unstructured (text, voice, image).

When developing a machine learning model, most of the effort goes into data pre-processing rather than into building the model itself: approximately 80% of the total work.

Data pre-processing is basically cleaning the data and making it ready for the model. The basic steps include:

  1. Data Cleaning
  2. Data Integration
  3. Data Transformation
  4. Data Reduction
  5. Data Discretization
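The five steps above can be sketched on a toy example with Pandas. This is only one simple way to do each step, not a complete recipe; the `customers` and `orders` tables, their column names, and the age bins are all hypothetical.

```python
import numpy as np
import pandas as pd

# Two made-up tables: one has a duplicate row and a missing age
customers = pd.DataFrame({
    "customer_id": [1, 2, 2, 3, 4],
    "age": [25.0, 32.0, 32.0, np.nan, 51.0],
    "income": [30000.0, 52000.0, 52000.0, 41000.0, 88000.0],
})
orders = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "n_orders": [3, 1, 7, 2],
})

# 1. Data cleaning: drop duplicate rows, fill the missing age with the median
clean = customers.drop_duplicates().copy()
clean["age"] = clean["age"].fillna(clean["age"].median())

# 2. Data integration: combine the two sources on their shared key
merged = clean.merge(orders, on="customer_id", how="left")

# 3. Data transformation: min-max scale income into the range [0, 1]
income = merged["income"]
merged["income_scaled"] = (income - income.min()) / (income.max() - income.min())

# 4. Data reduction (simplest form): keep only the columns the model needs
reduced = merged[["customer_id", "age", "income_scaled", "n_orders"]].copy()

# 5. Data discretization: turn the continuous age into labelled bins
reduced["age_group"] = pd.cut(
    reduced["age"], bins=[0, 30, 45, 120], labels=["young", "middle", "senior"]
)

print(reduced)
```

On real data each step needs more thought (for example, whether the median is a sensible fill value), but the overall flow usually looks like this.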

Links:

https://towardsdatascience.com/data-pre-processing-techniques-you-should-know-8954662716d6

 

Step 5: Machine Learning Models... A Big Weapon (scikit-learn)

There are two main types of learning:

1.) Supervised Learning

  •  Classification
  •  Regression

Models: Linear Regression, Multiple Linear Regression, Random Forest (an ensemble of CART decision trees), Support Vector Machine (SVM), Logistic Regression, Naive Bayes, etc.
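Here is a minimal sketch of both supervised tasks with scikit-learn: Logistic Regression for classification and Linear Regression for regression. The iris dataset ships with scikit-learn; the regression data is synthetic and generated only for this example, and the split size and random seeds are arbitrary choices.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Classification: predict the iris species from four measurements
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of correct predictions
print("classification accuracy:", accuracy)

# Regression: recover the line y = 3x + 2 from noisy synthetic points
rng = np.random.default_rng(0)
X_reg = rng.uniform(0, 10, size=(100, 1))
y_reg = 3.0 * X_reg[:, 0] + 2.0 + rng.normal(0, 0.5, size=100)
reg = LinearRegression().fit(X_reg, y_reg)
print("slope:", reg.coef_[0], "intercept:", reg.intercept_)
```

The other models in the list follow the same `fit` / `predict` / `score` pattern, so swapping in, say, `RandomForestClassifier` or `SVC` changes only the line that creates the model.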

 

2.) Unsupervised Learning

  • Clustering

Models: K-Means Clustering

Links: https://medium.com/
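A short K-Means sketch with scikit-learn is shown below. The two blobs of points are synthetic and the choice of two clusters is an assumption that matches how the data was generated; on real data the number of clusters usually has to be explored.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two synthetic, well-separated blobs of 2-D points (no labels are given)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# Ask K-Means to group the points into two clusters
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)

print("cluster centers:\n", kmeans.cluster_centers_)
print("points per cluster:", np.bincount(labels))
```

Unlike the supervised case, no target values are passed to the model: it only sees `X` and finds the groups on its own.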

https://www.analyticsvidhya.com/

 

Step 6: Measuring Performance of the Model and Adjusting Hyperparameters
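As a rough sketch of this step, the example below tunes two hyperparameters of an SVM with a cross-validated grid search and then measures performance on held-out data. The iris dataset, the SVM model, and the parameter grid are all assumptions picked for illustration; the right metric and grid depend on your problem.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Adjust hyperparameters: try every combination with 5-fold cross-validation
param_grid = {"C": [0.1, 1.0, 10.0], "kernel": ["linear", "rbf"]}
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)
print("best hyperparameters:", search.best_params_)

# Measure performance on data the model has never seen
y_pred = search.predict(X_test)
test_accuracy = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)  # rows: true class, columns: predicted class
print("test accuracy:", test_accuracy)
print("confusion matrix:\n", cm)
```

Keeping the test set out of the grid search matters: the hyperparameters are chosen on the training folds only, so the final score is a fairer estimate of how the model will behave on new data.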

 

In my next article, I will share steps for learning deep learning and natural language processing.

Links:

https://towardsdatascience.com/metrics-to-evaluate-your-machine-learning-algorithm-f10ba6e38234