As this topic came up several times this week in various discussions, I decided to write a post on Data Scientist vs. Data Analytics Engineer, even though it was not on my list of planned blog posts.
Data Science:
My personal understanding of Data Science (DS):
A data scientist is someone who understands the data and the business logic, and who provides predictions by sampling the current business data (also known as data insights, business insights, data discovery, or business discovery). These predictions show the direction in which the business is heading (good or bad), or where it should head based on emerging trends, so that the business can make the right decision on its next steps, such as:
- improving the product/feature based on user interest levels
- driving more users
- driving more clicks / impressions / conversions / revenue / leads
- user experience
- recommendation
- user retention
In general, Data Science is driven by Data Scientists, who typically hold a PhD in math, physics, statistics, machine learning, or computer science. Without a PhD in one of these areas, it is unlikely that one will be hired. At a recent ACM conference, a data science hiring manager at a leading online bidding company said in the open Q&A that she can't hire anyone without a PhD (plus experience).
Data Scientist Qualifications:
- Familiar with database systems (SQL interface, ad-hoc queries), especially MySQL and Hive, at least to begin with
- Java / Python / simple MapReduce job development, if needed
- Exposure to various analytic functions (OVER, median, rank, etc.) and how to use them on various data sets
- Mathematics, statistics, correlation, data mining, and predictive analytics (predicting future outcomes based on probability and correlation)
- R and/or RStudio (optionally Excel, SAS, IBM SPSS, MATLAB)
- Deep insight into (statistical) data model development (in an agile fashion); in general, a self-learning model is the best fit for today's dynamics, since it can learn from and tune itself on its own output, combined with its performance over time
- Work with (very) large data sets, grouping together various data sets and visualizing them
- Familiar with machine learning and/or data mining algorithms (Mahout, Bayesian, Clustering, etc.)
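To make the analytic functions mentioned in the list above a little more concrete, here is a minimal Python sketch of what a SQL-style `RANK() OVER (PARTITION BY ... ORDER BY ...)` plus a per-group median computes. The sample rows, the column layout, and the helper names `partition_by` and `rank_and_median` are assumptions made up for illustration; a real system would run this inside the database rather than in application code.

```python
from collections import defaultdict
from statistics import median

# Hypothetical sample data: (region, user, revenue)
rows = [
    ("east", "a", 120.0),
    ("east", "b", 80.0),
    ("east", "c", 100.0),
    ("west", "d", 50.0),
    ("west", "e", 70.0),
]


def partition_by(rows, key_index):
    """Group rows by the value in column key_index (like PARTITION BY)."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[key_index]].append(row)
    return dict(groups)


def rank_and_median(rows):
    """Emulate RANK() OVER (PARTITION BY region ORDER BY revenue DESC)
    and attach the median revenue of each region to every row."""
    result = []
    for region, group in partition_by(rows, 0).items():
        region_median = median(r[2] for r in group)
        ordered = sorted(group, key=lambda r: r[2], reverse=True)
        rank, previous = 0, None
        for position, (_, user, revenue) in enumerate(ordered, start=1):
            # Ties share a rank; the next distinct value skips ahead.
            if revenue != previous:
                rank, previous = position, revenue
            result.append((region, user, revenue, rank, region_median))
    return result


for row in rank_and_median(rows):
    print(row)
```

Each output row carries its rank within the region and the region-wide median, which is the kind of per-partition view these functions are typically used for.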
As there are different qualifications and areas of expertise within data science, one needs to pick the right candidate for the type of role they are going to play. For example, a natural language processing (NLP) role may need a different set of skills. At times it also depends on the team size; one person can be a jack of all trades, or the roles can be split among multiple teams.
At present, there is a lot of demand for Data Scientists in the market; it is probably one of the leading job roles after Data Analytics. Here is the trend for Data Science from Indeed:
Data Analytics:
Data Analytics (DA) is in general a logical extension of (or just a buzzword for) Data Warehousing (DW) and Business Intelligence (BI), providing complete insight into business data in its most usable form. The major difference between warehousing and analytics is that analytics can be real-time and dynamic in most cases, whereas a warehouse is ETL-driven and works offline.
Every business that deals with data must have Data Analytics; without analytics in place, the business is a dead man walking, without a heart, a soul, or a mind.
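The contrast between offline, ETL-driven warehousing and real-time analytics can be sketched roughly as follows. This is only a toy illustration under stated assumptions: the click events, the `batch_totals` function, and the `RunningTotals` class are hypothetical names invented here, not part of any particular product.

```python
from collections import Counter

# Hypothetical click events: (page, clicks)
events = [("home", 1), ("search", 1), ("home", 1), ("checkout", 1)]


def batch_totals(all_events):
    """Warehouse style: recompute totals from the full event set
    during a periodic offline ETL run."""
    totals = Counter()
    for page, clicks in all_events:
        totals[page] += clicks
    return totals


class RunningTotals:
    """Real-time style: update totals as each event arrives, so a
    query always sees numbers that include the latest event."""

    def __init__(self):
        self.totals = Counter()

    def ingest(self, page, clicks):
        self.totals[page] += clicks
        return self.totals[page]


live = RunningTotals()
for page, clicks in events:
    live.ingest(page, clicks)

# Both approaches agree once the batch job has caught up.
print(batch_totals(events) == live.totals)
```

The two approaches end up with the same totals; the difference is when the numbers become available to whoever is querying them.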
Data Analytics (Engineer) Qualifications:
- Familiar with data warehousing and business intelligence concepts
- Strong, in-depth exposure to SQL and analytic solutions
- Exposure to Hadoop-based analytics solutions (HBase, Hive, MapReduce jobs, Impala, Cascading, etc.)
- Exposure to various commercial enterprise analytical data stores (Vertica, Greenplum, Aster Data, Teradata, Netezza, etc.), especially how to store and retrieve data from them in the most efficient manner
- Familiar with various ETL tools (especially for transforming different data sources into analytics data stores) and, if needed, able to make everything (or some critical business features) real-time
- Schema design for storing and retrieving data efficiently
- Familiar with various tools and components in the data architecture
- Decision-making skills (real-time vs. ETL, using component X instead of Y to implement Z, etc.)
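As one small illustration of the schema design item in the list above, here is a sketch of a dimension table plus a fact table using Python's built-in `sqlite3` module. The table names (`dim_page`, `fact_clicks`), the columns, the index, and the sample values are all assumptions for illustration; the stores named above have their own DDL dialects and tuning options.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# Hypothetical star-style layout: one dimension table, one fact table.
conn.executescript("""
CREATE TABLE dim_page (
    page_id INTEGER PRIMARY KEY,
    name    TEXT NOT NULL
);
CREATE TABLE fact_clicks (
    page_id INTEGER NOT NULL REFERENCES dim_page(page_id),
    day     TEXT    NOT NULL,
    clicks  INTEGER NOT NULL
);
-- Index chosen for queries that filter by day and then join on page.
CREATE INDEX idx_fact_clicks_day ON fact_clicks(day, page_id);
""")

conn.executemany("INSERT INTO dim_page VALUES (?, ?)",
                 [(1, "home"), (2, "search")])
conn.executemany("INSERT INTO fact_clicks VALUES (?, ?, ?)",
                 [(1, "2013-01-01", 10),
                  (2, "2013-01-01", 4),
                  (1, "2013-01-02", 7)])

# Typical analytics query: aggregate the facts, label them via the dimension.
totals_by_page = conn.execute("""
    SELECT p.name, SUM(f.clicks)
    FROM fact_clicks AS f
    JOIN dim_page AS p ON p.page_id = f.page_id
    GROUP BY p.name
    ORDER BY p.name
""").fetchall()

print(totals_by_page)
```

Keeping the descriptive attributes in the dimension table and only keys and measures in the fact table is one common way to keep the large table narrow and fast to scan.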
Sometimes a Data Analytics Engineer also does data mining on demand, as they have a better understanding of the data than anyone else; in general, the two roles have to work closely together to get better results.
Data Analytics can also be divided or shared among four different teams or people (as it is hard to hire one person with the complete skill set, and moreover administration is different from development):
- data architect
- database administrator
- analytics engineer
- operations
At present, Data Analytics is probably one of the hottest jobs (maybe Hadoop/Big Data Engineer has taken over by now). Here is the trend for Data Analytics from Indeed; it may continue to be hot for a while, as most businesses need to have data analytics in place.
Even though Data Science and Data Analytics look similar in terms of technology domain, data science is a data consumer within the business unit and depends solely on data provided by the data analytics team. Moreover, most model predictions and algorithms work much better on large data sets, because larger samples yield more reliable probability estimates; the bigger the data, the better your chance of predicting correctly and driving the business further. This means the two depend directly on each other. If you have an engineer with both sets of qualifications, that person can play both roles.
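The point about bigger data sets can be made a bit more concrete with the standard error of an estimated rate, which for a proportion p measured on n independent samples is sqrt(p(1 - p) / n). The sketch below is only an illustration: the 5% conversion rate and the sample sizes are assumed numbers, and `standard_error` is a helper name chosen here.

```python
import math


def standard_error(p, n):
    """Standard error of a proportion p estimated from n independent samples."""
    return math.sqrt(p * (1.0 - p) / n)


# Assumed 5% conversion rate, measured on increasingly large samples.
for n in (100, 10_000, 1_000_000):
    print(n, standard_error(0.05, n))
```

Because the error shrinks with the square root of n, a data set 100 times larger gives an estimate roughly 10 times tighter, which is one reason the quality of a model depends so heavily on how much data the analytics side can supply.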
Academy: How to Become a Data Scientist or Data Analytics Engineer
- Most higher-education institutions in the US now offer Data Science and Data Analytics courses, including popular institutions such as Berkeley, Stanford, Columbia, and Harvard.
- Here is a reference link on colleges offering these subjects as courses (it may not be complete or accurate; better to check with the institute directly):
- Here is one more online course (instructor-led or video) on the same topic (which covers pretty much everything needed for today's world):
Data Science Books
- Here is a nice blog post from Carl Anderson on all freely available data science books and materials
Great summary.
Thanks for this summary. Great job!
How many years of experience would be enough so that a candidate would not need a Ph.D.? 10? 20? 30?
Phil, it depends on the individual hiring manager. In general, a PhD is not mandatory as long as you can prove yourself with a few models, or agree up front to deliver results on an A/B test with a few users before taking up the full-time role. That said, 95% of the time managers will look for a PhD, or at least a track record of success with a few models.