Machine learning is here, but many missing steps still hinder its adoption by data engineers and by customers. In this first post, I kick things off with local environment setup and a quick example that compares algorithms on the same dataset.
The complete path to a live machine learning project
Start with the basic requirements
You should have basic knowledge of Python (this project uses v3.6), Git, and configuring virtual environments.
I will be using PyCharm, but any editor (even Notepad) works for writing code. At this point, though, the code doesn't need any changes, and you only need the command prompt to run a few commands. Please install Python 3.6 and Git if you don't have them installed yet.
Clone our project
The source code that we will use to demonstrate the start of this machine learning project is here: https://github.com/valuedate/machinelearningpart1
Open a command line (in this case Windows Command Line) and type:
git clone https://github.com/valuedate/machinelearningpart1
Prepare your local environment
Open the cloned project in PyCharm (or an equivalent editor); your path may differ from mine.
Go to FILE > SETTINGS
Click on PROJECT: MACHINELEARNINGPART1 > PROJECT INTERPRETER, then use the control at the top right to add a new interpreter.
Choose your option; for me, Virtualenv works. I only need to select NEW ENVIRONMENT, choose a location for the environment (one that isn't your cloned directory), and set the BASE INTERPRETER to Python 3.6.
Open the requirements.txt file and an option to INSTALL REQUIREMENTS will appear. Click on it and wait until it completes.
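Before running anything, it can help to confirm that the interpreter you selected really is the isolated one. The snippet below is a minimal sketch and is not part of the cloned repository: the helper names `in_virtual_env` and `describe_interpreter` are made up for illustration, and the check relies only on the standard `sys` module.

```python
import sys


def in_virtual_env():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Environments created by the built-in venv module (and recent virtualenv
    # releases) make sys.prefix differ from sys.base_prefix.
    # Older virtualenv releases set sys.real_prefix instead.
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


def describe_interpreter():
    """Summarize the interpreter version and whether it is isolated."""
    version = "{}.{}.{}".format(*sys.version_info[:3])
    return {"version": version, "virtual_env": in_virtual_env()}


if __name__ == "__main__":
    info = describe_interpreter()
    print("Python", info["version"])
    print("Inside a virtual environment:", info["virtual_env"])
```

Run it from the PyCharm terminal. If it reports that you are not inside a virtual environment, the project is probably still pointing at your system-wide Python.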
Now you are ready to run your first Machine Learning Project! Press ALT+F12 or VIEW > TOOL WINDOWS > TERMINAL and enter
You are ready to start...
... defining your goal. This is crucial for measuring the final result and accuracy of your model.
- Install local environment
- Define your goal
- Prepare data
- Visualize data
- Build model
- Prepare predictions
- Productize your solution
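To make the "Define your goal" step more concrete, here is a minimal sketch of how a goal could be written down as a target accuracy and then used to compare two candidate models on the same labels. Everything in it is illustrative: the label lists, the model names, the helper functions, and the 0.8 target are invented for this example and do not come from the project repository.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty dataset")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def compare_models(y_true, predictions, target):
    """Score each model's predictions and flag whether it meets the target."""
    report = {}
    for name, y_pred in predictions.items():
        score = accuracy(y_true, y_pred)
        report[name] = {"accuracy": score, "meets_goal": score >= target}
    return report


# Hypothetical labels and predictions, made up for illustration only.
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0, 1, 0, 1, 1, 0],  # 8 of 10 correct
    "model_b": [1, 1, 1, 0, 0, 0, 0, 1, 1, 0],  # 6 of 10 correct
}

report = compare_models(y_true, predictions, target=0.8)
for name, result in report.items():
    print(name, result)
```

Fixing the target before building any model keeps the comparison honest: each algorithm is judged against the same labels and the same threshold, rather than against whatever number it happens to produce.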