valuedate.io

Machine learning is here, but many missing steps still hinder its adoption by data engineers and by customers. This is the first post in a series; I am kicking it off with the local environment setup and a quick example that compares algorithms on the same dataset.

The complete path to a live machine learning project

Start with the basic requirements

You should have basic knowledge of Python (this project uses v3.6), Git, and configuring virtual environments.

I will be using PyCharm, but any editor (including Notepad) works. At this point the code doesn't need any changes, so you only need the command prompt to run a few commands. Please install Python 3.6 and Git if you don't have them installed yet.
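Before cloning, it can help to confirm that both prerequisites are reachable from the command prompt. The following is a minimal sketch using only the standard library; the function name check_prerequisites is hypothetical and not part of the project.

```python
import shutil
import sys


def check_prerequisites(min_python=(3, 6)):
    """Report whether the interpreter is recent enough and Git is on PATH."""
    return {
        # Compare only (major, minor) against the required version.
        "python_ok": sys.version_info[:2] >= min_python,
        # shutil.which returns None when the executable cannot be found.
        "git_found": shutil.which("git") is not None,
    }


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'yes' if ok else 'no'}")
```

If either line prints "no", install the missing tool before moving on.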

Clone our project

The source code that we will use to demonstrate the start of this machine learning project is here: https://github.com/valuedate/machinelearningpart1

Open a command line (in this case the Windows Command Prompt) and type:

                      git clone https://github.com/valuedate/machinelearningpart1

Prepare your local environment

Open the cloned project in PyCharm (or an equivalent tool); your path may differ from mine.

Go to FILE > SETTINGS

Click on PROJECT: MACHINELEARNINGPART1 > PROJECT INTERPRETER, then click the gear icon on the top right and choose ADD.

Choose your option; for me Virtualenv works. I only need to select NEW ENVIRONMENT, choose a location for the environment (one that isn't your cloned directory), and set the BASE INTERPRETER to Python 3.6.
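If you are not using PyCharm, an equivalent environment can be created from Python itself. The sketch below uses the standard library venv module rather than the Virtualenv option selected in PyCharm. Both the create_environment helper and the example folder are hypothetical.

```python
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its resolved path."""
    env_path = Path(env_dir).expanduser().resolve()
    # with_pip=True bootstraps pip so the requirements can be installed later.
    venv.EnvBuilder(with_pip=with_pip).create(str(env_path))
    return env_path


# Example call (hypothetical folder, kept outside the cloned project directory):
# create_environment(Path.home() / "envs" / "machinelearningpart1")
```

Running python -m venv followed by the target folder from the command prompt does the same thing. Either way, the environment still has to be activated (Scripts\activate on Windows) before installing the requirements.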

Open the requirements.txt file and an option to INSTALL REQUIREMENTS will appear. Click it and wait until the installation completes.

Now you are ready to run your first machine learning project! Press ALT+F12 or go to VIEW > TOOL WINDOWS > TERMINAL and enter:

                  python intro.py
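The script compares several algorithms on the same dataset. Its source is not reproduced in this post, so the following is only a sketch of what such a comparison can look like, not the repository's actual code. It assumes scikit-learn is installed. The Iris dataset, the five classifiers, the 5-fold cross-validation and the compare_models helper are all choices made here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier


def compare_models(X, y, n_splits=5, seed=0):
    """Return {model name: mean cross-validated accuracy} on the same data."""
    # Assumed set of classifiers; the real intro.py may use different ones.
    models = {
        "LogisticRegression": LogisticRegression(max_iter=1000),
        "KNeighborsClassifier": KNeighborsClassifier(),
        "DecisionTreeClassifier": DecisionTreeClassifier(random_state=seed),
        "GaussianNB": GaussianNB(),
        "SVC": SVC(),
    }
    # Every model is scored on identical folds so the numbers are comparable.
    folds = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    return {
        name: cross_val_score(model, X, y, cv=folds, scoring="accuracy").mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    scores = compare_models(X, y)
    # Print the best-scoring model first.
    for name, score in sorted(scores.items(), key=lambda item: -item[1]):
        print(f"{name:24s} {score:.3f}")
```

Scoring every model on the same folds is what makes the comparison fair: any difference in accuracy then comes from the algorithm rather than from a luckier split of the data.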

You are ready to start...

... defining your goal. This is crucial for measuring the final result and accuracy of your model.

Install local environment
Define your goal
Prepare data
Visualize data
Build model
Prepare predictions
Productize your solution

In the next post I will be preparing data for analysis.

