Understanding data and business in data science

Data science is about data and business: it is about how a company uses data to drive business decisions and strategy. It is not about science for its own sake, nor about AI, ML, or deep models; those are only tools that help a company mine insights from its data. Business sense and the ability to identify the right business problem come first. Whether your model is a simple decision tree, a linear model, boosting, or a fancy deep learning model does not matter; a successful model is one that solves the business problem with reasonable resources and a reasonable ROI. In many real business scenarios, there is not enough good data for a data scientist to tune a deep model, and the resources that can be allocated are limited.
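To make the ROI point concrete, here is a minimal sketch that compares two candidate models by return on investment. The `ModelOption` class and every figure in it are hypothetical, made up purely for illustration; real value and cost estimates would have to come from the business itself.

```python
from dataclasses import dataclass


@dataclass
class ModelOption:
    name: str
    annual_value: float  # hypothetical business value delivered per year
    annual_cost: float   # hypothetical cost: compute, data, engineering time

    def roi(self) -> float:
        """Return on investment: net gain divided by cost."""
        return (self.annual_value - self.annual_cost) / self.annual_cost


# All figures below are invented for illustration only.
options = [
    ModelOption("gradient boosting", annual_value=500_000, annual_cost=100_000),
    ModelOption("deep model", annual_value=550_000, annual_cost=400_000),
]

# Pick the option with the highest ROI, not the highest raw value.
best = max(options, key=lambda o: o.roi())

for o in options:
    print(f"{o.name}: ROI = {o.roi():.2f}")
print("best by ROI:", best.name)
```

With these made-up numbers the deep model delivers slightly more value but costs four times as much, so the simpler model is the better business choice.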

Start by understanding the business problem, then translate its requirements into a machine learning problem: is it actually a regression, a clustering, or a classification problem? Next, work out which types of data you need and how to collect them from the company's infrastructure. Data cleaning, feature engineering, and preliminary data analysis are a must. Data analysis may reveal an insightful feature that significantly improves model performance. Blindly trying fancy models is not good practice. Deep models cannot do everything for you. Traditional models can solve most business problems in a business-driven company. In some companies, the available resources will not allow you to use the latest fancy models, because the investment is too large and the benefit is not attractive enough.
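The first step, turning a business question into a regression, classification, or clustering problem, can be sketched as a rough rule of thumb. The function `frame_problem`, its `max_classes` threshold, and the sample data are all assumptions for illustration, not a standard method; in practice the framing comes from talking to the business, not from inspecting a column.

```python
from typing import Optional, Sequence


def frame_problem(target: Optional[Sequence[object]], max_classes: int = 10) -> str:
    """Guess the ML task type from the target column (a rough heuristic)."""
    if target is None:
        # No label available at all: group similar records instead.
        return "clustering"
    numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in target
    )
    if numeric and len(set(target)) > max_classes:
        # Numeric target with many distinct values: predict a quantity.
        return "regression"
    # Few distinct values or non-numeric labels: predict a category.
    return "classification"


# Hypothetical examples of business questions and their targets.
monthly_revenue = [120.5, 98.0, 143.2, 110.7, 99.9, 150.1,
                   87.3, 132.8, 101.4, 125.6, 140.0, 93.5]
churned = ["yes", "no", "no", "yes", "no"]

print(frame_problem(monthly_revenue))  # how much revenue next month?
print(frame_problem(churned))          # will this customer churn?
print(frame_problem(None))             # which customer segments exist?
```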

The above is my personal thinking; comments and discussion are welcome.
