Here is a typical “Big” data architecture that covers most of the components involved in a data pipeline. We run more or less the same architecture in production in a number of places, with some components varying because data sources and data consumption differ from company to company.
“Big” Data Architecture
Any data architecture loosely consists of four major logical components:
1. Data Source
The true source of data, coming from heterogeneous systems. These are typically your data stores (SQL or NoSQL), which provide structured data, plus any other data arriving through APIs or other means (semi-structured or unstructured).
- Data from SQL and NoSQL stores (MySQL, Oracle, PostgreSQL, MongoDB, etc.; mostly structured)
- Semi-structured or unstructured data (CRM, marketing, campaign, spend, revenue, leads, etc.)
- Web logs or other log files (weblogs, user clicks, user visits, activity, etc.)
2. Data Transformation
Transformation of data from one form to another, either as part of ETL (Extract, Transform, and Load) or through import/export tools and scripts. It is mainly used to load all sources of data into the data processing pipeline.
Log management tools can also be considered part of ETL: they generate useful events from log files and present them on dashboards with alerting in place, or the events can be loaded directly into the data processing stores.
- ETL and ELTL tools (Bash/Python/Perl/Java scripts, Business Objects, SSIS, Kettle, etc.)
- Sqoop (data source to Hadoop transfer tool, JDBC compatible)
- Import/export tools (SQL/NoSQL vendor-specific tools)
- Log management tools (Splunk, Syslog, custom log filter scripts, Flume, Scribe, Loggly, etc.)
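As a rough illustration of the transformation step, here is a minimal sketch of an ETL script that turns raw log lines into structured events and loads them into a store. The log format, the `page_views` table, and all function names are assumptions made up for this example; a real pipeline would use whatever format and target store it already has.

```python
import re
import sqlite3

# Hypothetical log format, assumed for illustration only:
# "<ip> <username> <ISO timestamp> <path> <status>"
LOG_PATTERN = re.compile(
    r"^(?P<ip>\S+) (?P<username>\S+) (?P<ts>\S+) (?P<path>\S+) (?P<status>\d{3})$"
)


def extract(lines):
    """Extract: yield raw, non-empty log lines from any iterable source."""
    for line in lines:
        line = line.strip()
        if line:
            yield line


def transform(raw_lines):
    """Transform: parse each line into a structured event, skipping malformed ones."""
    for line in raw_lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue
        event = match.groupdict()
        event["status"] = int(event["status"])
        yield event


def load(events, conn):
    """Load: write events into a table and return the number of rows inserted."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_views "
        "(ip TEXT, username TEXT, ts TEXT, path TEXT, status INTEGER)"
    )
    rows = [
        (e["ip"], e["username"], e["ts"], e["path"], e["status"]) for e in events
    ]
    conn.executemany("INSERT INTO page_views VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


def run_etl(lines, conn):
    """Chain the three steps; SQLite stands in for the real processing store."""
    return load(transform(extract(lines)), conn)
```

Calling `run_etl(open("access.log"), sqlite3.connect("events.db"))` would load a file in one pass; both file names are placeholders.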
3. Data Processing or Data Integration
This layer becomes yet another vast source of data, combining structured and unstructured data in one place (loaded either in real time or incrementally). It exists mainly for data processing (data warehousing or analytics) and generates usable data (materialized or aggregated) that the data consumption components can consume.
- Hadoop and its ecosystem (Hadoop/HDFS, MapReduce, HBase, Hive, Impala, Pig, etc.); uses HDFS as native storage
- Data warehouse and analytics solutions (MySQL, SQL Server, Vertica, Greenplum, Aster Data, Exadata, SAP HANA, IBM Netezza, IBM PureData, Teradata, etc.); these use vendor-specific storage and can optionally use HDFS, albeit with degraded performance
- In-memory analytics (SAS, Kognitio, Druid, etc.). This is an emerging market that is trying to take advantage of reading directly from HDFS. We will see a lot more in-memory analytics in the coming days.
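To make "combining structured and unstructured data to generate aggregated data" concrete, here is a small sketch in the MapReduce style, written in plain Python rather than against Hadoop. The event and user record shapes (`user_id`, `country`) and the function names are hypothetical, chosen only for this example.

```python
from collections import defaultdict


def map_phase(events, users):
    """Map: join each log event to its user record and emit (country, 1)."""
    for event in events:
        user = users.get(event["user_id"])
        if user is not None:  # drop events with no matching structured record
            yield user["country"], 1


def reduce_phase(pairs):
    """Reduce: sum the emitted counts for each key."""
    totals = defaultdict(int)
    for key, value in pairs:
        totals[key] += value
    return dict(totals)


def views_by_country(events, users):
    """Produce an aggregate that a reporting layer could consume directly."""
    return reduce_phase(map_phase(events, users))
```

On a real cluster the map and reduce steps would run in parallel across nodes; the sketch only shows the shape of the computation.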
4. Data Consumption
Components that either consume the data or expose it in usable form to end users or to other layers, internally (ad hoc) or externally (through APIs).
- Reporting (custom dashboards, MicroStrategy, Pentaho, Business Objects, Cognos, Hyperion, Tableau, etc.)
- Search or discovery (Solr, Elasticsearch, TIBCO Spotfire, Datameer, etc.)
- Data science, mining, and analysis (mainly internal data analysis to predict or estimate overall performance and to drive recommendations using a set of algorithms, user-defined MapReduce jobs, or ad-hoc queries)
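As one possible shape for the consumption layer, the sketch below takes an aggregate produced upstream and exposes it as a ranked report and as a JSON payload, the way a dashboard or API endpoint might. The function names and the `rows`/`key`/`value` field names are assumptions for illustration, not part of any particular reporting tool.

```python
import json


def top_n_report(aggregates, n=3):
    """Return the n largest aggregates as rows, ties broken alphabetically by key."""
    ranked = sorted(aggregates.items(), key=lambda item: (-item[1], item[0]))
    return [{"key": key, "value": value} for key, value in ranked[:n]]


def to_api_response(aggregates, n=3):
    """Serialize the report as JSON, as an external API might return it."""
    return json.dumps({"rows": top_n_report(aggregates, n)}, sort_keys=True)
```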
Apart from the four logical components, monitoring plays a crucial role in detecting failures within the data pipeline, and in tracking threshold changes that reveal bottlenecks in performance, scalability, and overall throughput.
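A threshold check of this kind can be sketched in a few lines. The `StageMetrics` fields, the `check_stage` function, and the rows-per-second threshold are all hypothetical; in practice the numbers would come from whatever each pipeline stage already reports.

```python
from dataclasses import dataclass


@dataclass
class StageMetrics:
    """Metrics reported by one pipeline stage (hypothetical shape)."""
    name: str
    rows_processed: int
    seconds: float
    failed: bool = False


def check_stage(metrics, min_rows_per_second):
    """Return a list of alert strings for one stage (empty if healthy)."""
    if metrics.failed:
        return [f"{metrics.name}: stage failed"]
    # Guard against a zero duration so the check itself never raises.
    throughput = metrics.rows_processed / metrics.seconds if metrics.seconds > 0 else 0.0
    if throughput < min_rows_per_second:
        return [
            f"{metrics.name}: throughput {throughput:.1f} rows/s "
            f"below threshold {min_rows_per_second:.1f}"
        ]
    return []
```

Running the check after every stage and forwarding non-empty results to an alerting channel is one simple way to surface both failures and slowdowns.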
Venu blogs a “typical” data architecture. He missed Memcached and the storage layer, but it's too real and ugly: https://t.co/kXtKq5MW
Where is the support for unstructured data? The architecture is missing support for natural language processing and machine learning approaches to detect patterns in human languages. Not everything fits into a table. ETL tools can't accurately parse language into tabular data.
Thanks, Olin. NLP and text mining are indeed missing (especially how to extract the data); I will probably cover that as a separate topic in the coming days, as it has its own significance.
#Fail: This typical “Big” data architecture is missing everything unstructured. Big Data requires machine learning. https://t.co/2io7PjHK
Surprised to see about 11 emails so far in my inbox about the missing NLP component in this blog post https://t.co/Psw6UAJb
Thanks for such a useful article. Databases are always my favourite subject, and you explained things well.
RT @xguru: Typical “Big” Data Architecture https://t.co/yXgad9qI It seems a bit oversimplified, but it is nicely summarized in a short post with a single diagram 😉
Not sure there is really a typical big data architecture, but a good example anyway. https://t.co/SKzXxtFdeM
RT @jameskobielus: “Typical “Big” Data Architecture” (https://t.co/sVRQX9ONMt) JK–I’d put distancing quotes around “typical.” There’s no typical architecture