Today I defied the fears that have kept me from writing about my training journey all this while. This is me showing up and dealing with procrastination head-on. I am very excited that I am finally able to do this.
I started taking a course on Data Science, all thanks to @IG4good and @datacamp for providing an opportunity such as this for young people like us to hone our skills and contribute our quota to problem solving.
We were advised to take a general course, Data Science for Everyone, as it will serve as a pointer for us and aid our decision-making in choosing the course to focus on.
I was able to complete about 35% of my course outline and here are a few things I learnt in the process:
DATA SCIENCE
During the course of the class, I was able to learn:
- what data can do,
- the Data Science workflow, and
- the various applications of Data Science.
Data science is a set of methodologies for taking in the thousands of forms of data available to us today and using them to draw meaningful conclusions.
THINGS DATA CAN DO: Data can be used to achieve the following:
- detect abnormal events
- diagnose the causes of events and behaviors
- predict the future.
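As a minimal sketch of the first item, detecting abnormal events, the Python snippet below flags values that sit far from the average. The daily step counts, the function name `find_abnormal` and the threshold of 2 standard deviations are my own assumptions for illustration and are not from the course.

```python
from statistics import mean, stdev


def find_abnormal(values, threshold=2.0):
    """Return the values that lie more than `threshold` standard deviations from the mean."""
    avg = mean(values)
    spread = stdev(values)
    if spread == 0:
        # All values are identical, so nothing stands out.
        return []
    return [v for v in values if abs(v - avg) / spread > threshold]


# Hypothetical daily step counts from a smart watch; one day is clearly unusual.
daily_steps = [5200, 4900, 5100, 5300, 5000, 4800, 5150, 15000]
abnormal_days = find_abnormal(daily_steps)
print(abnormal_days)
```

A simple rule like this only works when most of the data is "normal"; real projects often use more robust methods.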
DATA SCIENCE WORKFLOW: This describes the processes involved in studying and analyzing data, which include:
- Data collection and storage
- Data preparation
- Exploration and visualization
- Experimentation and prediction
APPLICATIONS OF DATA SCIENCE
- TRADITIONAL MACHINE LEARNING
- IOT
- DEEP LEARNING
Requirements for Machine Learning:
- A well-defined question.
- A set of example data.
- A new set of data to use our algorithm on.
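To make these three requirements concrete, here is a small Python sketch. The question, the numbers and the simple "copy the label of the closest example" rule are all my own illustrative assumptions, not something taken from the course; a real project would normally use a machine learning library.

```python
import math

# 1. A well-defined question: "Will this student pass or fail?"

# 2. A set of example data: (hours studied, hours slept) with a known outcome.
#    These records are made up for illustration.
examples = [
    ((1, 4), "fail"),
    ((2, 5), "fail"),
    ((6, 7), "pass"),
    ((8, 6), "pass"),
]


def predict(student):
    """Copy the label of the closest example (a 1-nearest-neighbour rule)."""
    closest_features, closest_label = min(
        examples, key=lambda example: math.dist(example[0], student)
    )
    return closest_label


# 3. A new set of data to use our algorithm on.
new_students = [(7, 7), (1.5, 5)]
predictions = [predict(student) for student in new_students]
print(predictions)
```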
IOT (Internet of Things) refers to gadgets that aren't standard computers, e.g. smart watches, internet-connected home security systems, electronic toll collection systems, building energy management systems, etc. These devices are a great source of data for data science projects.
DEEP LEARNING: Here, multiple layers of mini algorithms, called neurons, work together to draw complex conclusions. It takes a lot of training data and is used to solve data-intensive problems.
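The idea of layers of neurons working together can be sketched in a few lines of Python. This is only a toy illustration under my own assumptions: the weights are hand-picked (a real network learns them from training data), and the names `relu` and `layer` are hypothetical helpers, not part of any library.

```python
def relu(x):
    """A common activation function: negative values become zero."""
    return max(0.0, x)


def layer(inputs, weights, biases):
    """One layer: each neuron takes a weighted sum of the inputs, adds a bias, then applies relu."""
    return [
        relu(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


# Hand-picked, hypothetical weights for a network with 2 inputs,
# a hidden layer of 2 neurons, and 1 output neuron.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, 2.0]]
output_biases = [0.5]

inputs = [2.0, 1.0]
hidden = layer(inputs, hidden_weights, hidden_biases)   # first layer of neurons
output = layer(hidden, output_weights, output_biases)   # second layer uses the first layer's results
print(hidden, output)
```

Each layer feeds its results to the next one, which is how many simple neurons can combine into a more complex conclusion.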
JOBS WITHIN DATA SCIENCE
- Data Engineer
- Data Analyst
- Data Scientist
- Machine Learning Scientist
DATA ENGINEERS
They control the flow of data: they build data pipelines and storage solutions, and they design infrastructure to collect data and keep it accessible. They focus on the data collection and storage step, where they store and maintain data.
TOOLS
- SQL, to store and organize data.
- Java, Scala or Python, to process data.
- Shell (the command line), to automate and run tasks.
- Cloud computing for the storage of large amounts of data.
DATA ANALYSTS
They describe and visualize data by creating reports and dashboards that summarize it. To do this, they clean the data first. (They have less programming and more statistics experience than other roles.) They focus more on data preparation and on exploration and visualization.
TOOLS
- SQL, to query data, that is, to retrieve and aggregate it.
- Spreadsheets, to perform simple analysis.
- BI (Business Intelligence) tools such as Tableau, Power BI and Looker, to create dashboards and visualizations.
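Here is a minimal sketch of what retrieving and aggregating data with SQL can look like, using Python's built-in `sqlite3` module and an in-memory database. The `sales` table and its rows are made up for illustration and are not from the course.

```python
import sqlite3

# An in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("North", 120.0), ("South", 80.0), ("North", 200.0), ("South", 50.0)],
)

# Retrieve the data and aggregate it: total sales per region.
rows = conn.execute(
    "SELECT region, SUM(amount) AS total FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(rows)
```

The `GROUP BY` clause is what turns many individual rows into one summary row per region, which is the kind of result that ends up in a report or dashboard.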
DATA SCIENTISTS
They have a strong background in statistics, which enables them to find new insights from data.
They use traditional machine learning for prediction and forecasting, and focus on data preparation, exploration and visualization, and experimentation and prediction. In short, they gain insight from data.
This is a summary of what I learnt in class today. I will continue tomorrow and share more takeaways. I hope this guides someone someday.
Thank you for taking out time to read my first technical writing article. This means a lot to me and my career.