What subjects should you learn?
- Probability and statistics
- Excel / SQL / Databases
- Programming basics
- Data visualization
Which programming language should you use?
- Python: one of the most popular languages for data analysis; easy to learn, with a rich ecosystem of libraries.
- R: widely used in statistics and academia.
Where to find courses?
Recommended Courses
Programming Basics
- Python Programming (Chinese, MOOC)
- 2022 Complete Python Bootcamp From Zero to Hero in Python (Udemy)
SQL and Databases
Data Visualization
- Data Visualization—Python Applications (Chinese, MOOC)
- Accounting Data Analytics Specialization (Coursera)
- Business Data Analysis and Applications (Chinese, MOOC)
- Python Data Analysis and Visualization (Chinese, MOOC)
Books
- Head First Python
- MySQL Basics
- Data Preprocessing from Beginner to Practice: Based on SQL, R, Python
- Python Data Analysis Practice (2nd Edition)
- Python Crash Course (Turing)
- Excel + Python: Fast Data Analysis and Processing
To save money on books, consider Kongfz Used Books: many second-hand copies are in near-new condition and cost much less than new ones.
Tools
- VS Code — Code editor
- Git — Version control
- Linux — Operating system for data work
- Shell Programming — Useful for automation and data processing
Additional Tips
- Practice with real datasets (e.g., Kaggle).
- Learn basic data cleaning and preprocessing.
- Understand basic data visualization principles and practice them with matplotlib, seaborn, or Excel charts.
- Familiarize yourself with Jupyter Notebook for interactive analysis.
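As a rough illustration of the data cleaning tip above, here is a minimal sketch that uses only the Python standard library. The sample data, the column names, and the `clean_rows` helper are all made up for this example; a real project would more likely load a Kaggle CSV file and might use a library such as pandas instead.

```python
import csv
import io
import statistics

# Hypothetical raw data: stray whitespace, a missing value, and a duplicate row.
RAW_CSV = """name,city,sales
 Alice ,Beijing,120
Bob,Shanghai,
Carol, Shenzhen ,95
 Alice ,Beijing,120
"""


def clean_rows(text):
    """Return de-duplicated rows with trimmed strings and integer sales."""
    cleaned = []
    seen = set()
    for row in csv.DictReader(io.StringIO(text)):
        # Trim whitespace around every field.
        row = {key: value.strip() for key, value in row.items()}
        # Skip rows where the sales value is missing.
        if not row["sales"]:
            continue
        record = (row["name"], row["city"], int(row["sales"]))
        # Skip exact duplicates of rows already kept.
        if record in seen:
            continue
        seen.add(record)
        cleaned.append(record)
    return cleaned


rows = clean_rows(RAW_CSV)
mean_sales = statistics.mean(sales for _, _, sales in rows)
print(rows)
print(mean_sales)
```

Dropping rows with missing values is only one possible choice; depending on the dataset, filling them with a default or an average may be more appropriate.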
Summary
Recommended learning order:
Python, Git, Shell → SQL → Data Visualization
Start with programming basics, then move to databases and data visualization. Practice regularly and build small projects to reinforce your skills.
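One possible shape for a first small project is sketched below: it combines Python with a SQL query by using the `sqlite3` module from the standard library. The order data, the table layout, and the `total_by_category` helper are hypothetical and only meant to show how the pieces fit together.

```python
import sqlite3

# Hypothetical order data for a practice project.
ORDERS = [
    ("2024-01-05", "books", 30.0),
    ("2024-01-06", "tools", 45.5),
    ("2024-01-07", "books", 12.5),
]


def total_by_category(orders):
    """Load rows into an in-memory SQLite database and aggregate them with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE orders (day TEXT, category TEXT, amount REAL)")
        conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)
        query = """
            SELECT category, SUM(amount) AS total
            FROM orders
            GROUP BY category
            ORDER BY total DESC
        """
        return conn.execute(query).fetchall()
    finally:
        # Always release the connection, even if a query fails.
        conn.close()


totals = total_by_category(ORDERS)
for category, total in totals:
    print(f"{category}: {total:.2f}")
```

From here, a natural next step is to plot the totals as a bar chart, which brings in the data visualization stage of the learning order.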