A Step-by-Step Guide for Data Scientists

In this step-by-step guide for data scientists, I am going to share the baby steps for becoming a successful data scientist or machine learning developer.

Step 1: Basics of Maths

Before creation, God did just pure mathematics. Then He thought it would be a pleasant change to do some applied.

-John Edensor Littlewood

Data science is a game of numbers; you should be familiar with the following maths concepts:

  1.  Linear Algebra
  2.  Calculus
  3.  Statistics
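
To give a feel for how these three areas show up in practice, here is a minimal sketch using NumPy. The matrix, the function f and the sample values are made up purely for illustration.

```python
import numpy as np

# Linear algebra: solve the linear system A x = b
A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
b = np.array([3.0, 5.0])
x = np.linalg.solve(A, b)

# Calculus: approximate the derivative of f(t) = t**2 at t = 3
# with a central finite difference
def f(t):
    return t ** 2

h = 1e-5
derivative = (f(3 + h) - f(3 - h)) / (2 * h)

# Statistics: mean and (population) standard deviation of a sample
samples = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
mean = samples.mean()
std = samples.std()

print(x, derivative, mean, std)
```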

Links:-  https://www.khanacademy.org



Step 2: Choose a programming language you are comfortable with

You can choose between Python, R, Java, and C/C++.

Links:- https://learnxinyminutes.com/


Step 3: Learn basic libraries

  1.  NumPy
  2.  Pandas
  3.  Matplotlib, etc.
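
As a rough first taste of the first two libraries, the sketch below does array arithmetic with NumPy and then builds a small table with Pandas. The column names and values are hypothetical, and plotting with Matplotlib is left out so the example runs without a display.

```python
import numpy as np
import pandas as pd

# NumPy: vectorised arithmetic on a whole array, no explicit loop
heights_cm = np.array([150.0, 160.0, 170.0, 180.0])
heights_m = heights_cm / 100.0

# Pandas: labelled, tabular data (hypothetical columns)
df = pd.DataFrame({
    "name": ["a", "b", "c", "d"],
    "height_m": heights_m,
    "weight_kg": [50.0, 64.0, 72.25, 81.0],
})

# Derive a new column from existing ones
df["bmi"] = df["weight_kg"] / df["height_m"] ** 2

# Look up the name in the row with the largest height
tallest = df.loc[df["height_m"].idxmax(), "name"]

print(df)
print("Tallest:", tallest)
```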

Links :- http://www.numpy.org/



Step 4: Data Pre-processing

Data! Data! Data!

Data mostly comes in two types: structured (tabular) and unstructured (text, voice, image).

When developing a machine learning model, most of the effort goes into data pre-processing rather than into building the model itself. As a rule of thumb, that is roughly 80% of the work.

Data pre-processing is basically cleaning the data and making it ready for the model. The basic steps include:

  1. Data Cleaning
  2. Data Integration
  3. Data Transformation
  4. Data Reduction
  5. Data Discretization
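
The five steps above can be sketched with Pandas on a toy dataset. Everything here is an assumption for illustration: the tables, the column names, the choice of median filling and min-max scaling, and the age bins. Real projects will need different choices at each step.

```python
import numpy as np
import pandas as pd

# Hypothetical toy tables; names and values are made up
customers = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 4],
    "age": [25.0, 25.0, np.nan, 47.0, 62.0],
    "income": [30000.0, 30000.0, 52000.0, 81000.0, 45000.0],
})
cities = pd.DataFrame({
    "customer_id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
})

# 1. Data cleaning: drop duplicate rows, fill missing ages with the median
clean = customers.drop_duplicates().copy()
clean["age"] = clean["age"].fillna(clean["age"].median())

# 2. Data integration: join the two sources on a shared key
merged = clean.merge(cities, on="customer_id", how="left")

# 3. Data transformation: min-max scale income to the [0, 1] range
income = merged["income"]
merged["income_scaled"] = (income - income.min()) / (income.max() - income.min())

# 4. Data reduction: keep only the columns the model will use
reduced = merged[["age", "income_scaled", "city"]].copy()

# 5. Data discretization: bin the continuous age into labelled ranges
reduced["age_group"] = pd.cut(
    reduced["age"], bins=[0, 30, 50, 120], labels=["young", "middle", "senior"]
)

print(reduced)
```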




Step 5: Machine Learning Models... A Big Weapon (scikit-learn)

There are two main types of learning:

1.) Supervised Learning

  •  Classification
  •  Regression

Models: Linear Regression, Multiple Linear Regression, Random Forest (built from CART decision trees), Support Vector Machine (SVM), Logistic Regression, Naive Bayes, etc.
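
Here is a minimal sketch of both supervised tasks with scikit-learn: logistic regression for classification and linear regression for regression. The iris dataset, the synthetic regression data (y = 3x + 2 plus noise), the split ratio and the random seeds are my own choices for illustration, not the only way to do it.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split

# Classification: predict the iris species from four measurements
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)
clf = LogisticRegression(max_iter=1000)
clf.fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)  # fraction of correct test predictions

# Regression: recover a line from noisy synthetic data (y = 3x + 2 + noise)
rng = np.random.default_rng(0)
X_reg = rng.uniform(0, 10, size=(100, 1))
y_reg = 3.0 * X_reg[:, 0] + 2.0 + rng.normal(0, 0.5, size=100)
reg = LinearRegression().fit(X_reg, y_reg)
slope, intercept = reg.coef_[0], reg.intercept_

print("Test accuracy:", accuracy)
print("Fitted line: y = %.2f x + %.2f" % (slope, intercept))
```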


2.) Unsupervised Learning

  • Clustering

Models: K-Means Clustering

Links:- https://medium.com/
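
A small sketch of K-Means with scikit-learn, assuming two well-separated synthetic blobs of points. The blob centres, the spread and the number of clusters are made up for illustration; on real data you usually do not know the right number of clusters in advance.

```python
import numpy as np
from sklearn.cluster import KMeans

# Two synthetic groups of 2-D points (hypothetical data)
rng = np.random.default_rng(0)
blob_a = rng.normal(loc=[0.0, 0.0], scale=0.5, size=(50, 2))
blob_b = rng.normal(loc=[5.0, 5.0], scale=0.5, size=(50, 2))
X = np.vstack([blob_a, blob_b])

# No labels are given; the algorithm groups the points on its own
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(X)
centers = kmeans.cluster_centers_

print("Cluster centres:")
print(centers)
```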



Step 6: Measuring Performance of the Model and Adjusting Hyperparameters
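
One common way to do both at once is a cross-validated grid search. The sketch below is only an illustration: the iris dataset, the SVM model, the parameter grid and accuracy as the metric are all my assumptions, and other models and metrics work the same way.

```python
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Candidate hyperparameter values to try (a hypothetical grid)
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Each combination is scored with 5-fold cross-validation on the training set
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X_train, y_train)

best_params = search.best_params_
cv_accuracy = search.best_score_

# Final check on data the search never saw
test_accuracy = accuracy_score(y_test, search.predict(X_test))

print("Best hyperparameters:", best_params)
print("Cross-validated accuracy:", cv_accuracy)
print("Test accuracy:", test_accuracy)
```

Keeping a separate test set matters here: the cross-validation score was used to pick the hyperparameters, so it tends to be a little optimistic.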


In my next article, I will share the steps for learning deep learning and natural language processing.