1.2 Why HTAP Matters (Introduction to HTAP Databases)
Speaker:
Xiaoyu Ma (马晓宇)
Senior Technical Director - Real-time analytics and SQL meta
Tech lead@Quantcast / Netease / PingCAP
Big Data / Distributed database
Before we begin
- Context: As the need for real-time analytics and HTAP grows, this session introduces the concept of HTAP.
- Goal: After this session, the audience will have a basic idea of what HTAP is
- Outline:
- Database evolution
- What the term HTAP means
- Why HTAP is needed and how it helps you
- TiDB HTAP architecture
- Real world scenarios
- Lab requirements (if needed): N/A
Part I: What is HTAP
- An overview of HTAP
- Goal
- Subtopics
- What is HTAP
- AP & TP Databases
- Why HTAP and how it helps you
- Technical difficulties
- Key points
- Review of goal
What is HTAP
- Term coined by Gartner
- HTAP (Hybrid Transactional/Analytical Processing) is a very simple concept
- TP = Transactional Processing
- Row format, update in real-time
- High concurrency and consistency, touch only a few rows each time
- Current data
- AP = Analytical Processing
- Columnar format, batch update
- Low concurrency, each query processes a large batch of data
- Historical data
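The row-versus-columnar contrast above can be made concrete with a toy sketch. Everything here (the `rows` and `columns` structures, the function names, the sample orders) is hypothetical and chosen only for illustration; real engines use far more elaborate on-disk formats.

```python
# Toy illustration only: the same three orders stored row-wise and column-wise.
# All names and values are made up for this sketch.

# Row layout: each record is kept together, which suits point reads and updates.
rows = [
    {"id": 1, "customer": "a", "amount": 10},
    {"id": 2, "customer": "b", "amount": 25},
    {"id": 3, "customer": "a", "amount": 40},
]

# Columnar layout: one contiguous list per column, which suits large scans.
columns = {
    "id": [1, 2, 3],
    "customer": ["a", "b", "a"],
    "amount": [10, 25, 40],
}


def tp_update_amount(rows, order_id, new_amount):
    """TP-style access: locate a single row and change it in place."""
    for row in rows:
        if row["id"] == order_id:
            row["amount"] = new_amount
            return True
    return False


def ap_total_amount(columns):
    """AP-style access: scan one column and ignore all the others."""
    return sum(columns["amount"])
```

In this sketch the two layouts are independent copies, so an update applied to `rows` does not reach `columns`. Keeping the two forms in sync is the data synchronization problem that HTAP systems have to solve.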
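HTAP is a term coined by Gartner, a well-known market analysis and research firm.
Traditional data platforms:
Data in the data warehouse is not updated in a timely manner, and the architecture is complex.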
Why HTAP
The boundary between TP/AP is blurry now
- TP-ish AP use cases
- Comprehensive query platforms that serve reports and high-concurrency short queries at the same time
- AP-ish TP use cases
- Analyze and optimize online transactional business in real time
- Real-time cross-BU (business unit) data services
How HTAP helps you
- HTAP databases shine
- Simplify architecture
- Lower maintenance cost
- Empower real-time scenarios
- Improve business agility
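HTAP simplifies the architecture, lowers operations and maintenance cost, and supports real-time analytics and decision-making.
Case: a sales data platform
This platform must provide both TP and AP capabilities.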
Difficulties
- Meeting the requirements from both sides is hard
- Scalability
- It's easy to build a distributed AP database, but a distributed TP database is hard
- TP/AP at the same time
- Supporting both storage forms
- Avoiding workload interference
- Seamless integration
- Data synchronization
- Fresh data
- Scalability
Part II: How HTAP helps you
- An overview of TiDB HTAP
- Goal
- Subtopics
- TiDB HTAP introduction
- Real world scenarios
- Key points
- Review of goal
TiDB HTAP
- A scalable database
- Built for strict transactional use cases
- Proven in core financial businesses
- Equipped with powerful analytical engines
- Natural fit for data hub / real-time data applications
What's new in TiDB 4.0 HTAP
- Real-time updatable columnar engine
- Scalable row-wise and columnar engines
- Separated machines, no interference
- Consistent data replication
- Vectorized engine
- Smart selection between row and column formats
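Replication that is asynchronous yet still gives consistent reads can look contradictory at first. The sketch below shows one possible way to reconcile the two: the columnar replica applies an ordered log in the background, and a read first notes the newest log position on the row store and waits until the replica has applied up to that position. This is an assumption made for illustration, not a description of TiDB's actual protocol; `RowStore`, `ColumnReplica`, and all method names are hypothetical.

```python
class RowStore:
    """Hypothetical row store: accepts writes and appends them to an ordered log."""

    def __init__(self):
        self.data = {}
        self.log = []  # ordered list of (key, value) changes

    def write(self, key, value):
        self.data[key] = value
        self.log.append((key, value))

    def latest_index(self):
        """Number of log entries written so far."""
        return len(self.log)


class ColumnReplica:
    """Hypothetical columnar replica that lags behind the row store."""

    def __init__(self, source):
        self.source = source
        self.applied = 0  # number of log entries applied so far
        self.data = {}

    def apply_pending(self, limit=None):
        """Apply log entries up to `limit` (a background task in a real system)."""
        end = self.source.latest_index()
        if limit is not None:
            end = min(end, limit)
        while self.applied < end:
            key, value = self.source.log[self.applied]
            self.data[key] = value
            self.applied += 1

    def consistent_read(self, key):
        """Catch up to the row store's current log position before answering."""
        read_index = self.source.latest_index()
        self.apply_pending(limit=read_index)
        return self.data.get(key)
```

Writes never wait for the replica, so the transactional path is not slowed down; only the analytical read pays the cost of catching up.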
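An updatable columnar storage engine has been added.
The row-store engine and the columnar engine can use separate server resources and do not interfere with each other.
Replication from the row store to the column store is consistent (and asynchronous).
The optimizer automatically chooses whether to use the row store or the column store.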
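A minimal sketch of how such an automatic choice could work is a cost comparison between the two formats. The cost constants and the `choose_engine` function below are invented for illustration and do not reflect TiDB's real cost model.

```python
# Hypothetical cost constants, chosen only to make the example readable.
ROW_COST_PER_CELL = 1.0       # a row store reads every column of each row
COLUMN_COST_PER_CELL = 0.2    # a column store reads only the needed columns
COLUMN_STARTUP_COST = 1000.0  # fixed overhead of starting a columnar scan


def choose_engine(rows_scanned, columns_used, total_columns):
    """Return "row" or "column", whichever has the lower estimated cost."""
    row_cost = rows_scanned * total_columns * ROW_COST_PER_CELL
    column_cost = (
        COLUMN_STARTUP_COST
        + rows_scanned * columns_used * COLUMN_COST_PER_CELL
    )
    return "row" if row_cost <= column_cost else "column"
```

Under these assumed constants, a query touching 5 rows of a 20-column table is routed to the row store, while a scan over 1,000,000 rows that needs only 2 of the 20 columns is routed to the column store.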
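TiDB 4.0 architecture:
Real-world case 1: a one-stop TP + AP application
Simplifies the architecture: one system replaces two, and the data stays fresh.
Real-world case 2: a real-time data warehouse
Carries data changes from different business systems and supports real-time business analysis.
Comprehensive data platform:
TiSpark can span multiple data platforms.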