How to Improve Design Skills
Jeffery Yuan
April 26, 2019
Agenda
- How to Design
- System Design Principles
- Learning from Open Source
- Learning from Existing Products
- System Design Practices
How to Design
- Take time to think about your design
- Minimize upfront design (YAGNI)
- This doesn’t mean skipping the time needed to design the component
- Related components
- Impact on other components
- What are alternatives?
- Welcome different approaches and discussion
How to Design
- Estimation
- back-of-the-envelope calculation
- Estimated data size, QPS
- Take time to design data schema
- Because the schema is difficult to change after deploying to production
- Better user experience
- Thinking from client/user perspective
- How they use it, what they would like to know
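The estimation bullets above (data size, QPS) can be sketched as a small calculation. This is a minimal illustration, not a sizing method from the talk: the function name `estimate`, the `peak_factor` multiplier, and every input number are assumptions chosen only to make the arithmetic easy to follow.

```python
# Back-of-the-envelope sizing. All inputs are illustrative assumptions.
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def estimate(daily_requests, avg_record_bytes, write_ratio,
             retention_days, peak_factor=2.0):
    """Return rough QPS and storage figures from a few coarse inputs."""
    avg_qps = daily_requests / SECONDS_PER_DAY
    peak_qps = avg_qps * peak_factor          # assumed peak-to-average ratio
    daily_writes = daily_requests * write_ratio
    storage_bytes = daily_writes * avg_record_bytes * retention_days
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak_qps,
        "storage_gb": storage_bytes / 1e9,    # decimal gigabytes
    }


# Hypothetical workload: 86.4M requests/day, 10% writes, 1 KB records,
# kept for one year.
result = estimate(daily_requests=86_400_000, avg_record_bytes=1_000,
                  write_ratio=0.1, retention_days=365)
print({name: round(value, 1) for name, value in result.items()})
```

Even numbers this rough are usually enough to tell whether the data fits on one machine and whether the schema needs a sharding key from day one.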
Reflection – Lesson Learned
- What mistakes we made
- Where to store data: DynamoDB or not?
- The key for Solr schema
- Why they happened:
- Did not consider near-future requirements
- Made decisions carelessly
Reflection – Lesson Learned
- Better client library
- Contains only the libraries and code that clients need
- Package shared configuration in the library
- Idempotent
- Policy to expire/archive data - Less data
- Optimize data for read
- Read Heavy vs Write Heavy
- Design to Be Disabled - feature toggle
- Isolate Faults - Circuit breaker
- Throttling - Rate limit
- Stateless
- Asynchronous
- Back pressure with exponential backoff
- Message queues
- Cache
- Visibility – monitoring
- Separation of concerns
- CAP
- Graceful Degradation
- Be Robust - Hide errors as much as possible
- Be conservative in what you send, be liberal in what you accept
- Make your apps do something reasonable even if not all is right
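The "Isolate Faults" item in the list above is commonly implemented with a circuit breaker. The following is a minimal sketch under stated assumptions: the class name `CircuitBreaker`, its thresholds, and the injectable `clock` parameter are all hypothetical and chosen for illustration; production libraries add per-endpoint state, metrics, and thread safety.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker rejects a call without attempting it."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so tests need not sleep
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Open: fail fast instead of hitting the broken dependency.
                raise CircuitOpenError("circuit open; failing fast")
            # Timeout elapsed: half-open, let this one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            half_open = self.opened_at is not None
            if half_open or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()   # open (or re-open)
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, for example `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and treat `CircuitOpenError` as a signal to serve a degraded response.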
- What makes them popular
- When to use them, when not
- LSM (Log-Structured Merge Trees)
- SSTable
- MemTable - in-memory write buffer, flushed to disk as an SSTable
- How C* handles deletes: tombstones (grace period)
- Merkle trees
- Bloom Filter
- Index
- CommitLog
- Serialize cache data (row-cache, key cache) to avoid cold restart
- Session Coordinator
- Gossip protocol
- Seed nodes
- Consistent Hashing
- Eventual Consistency
- Local Index (vs Global Index)
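The consistent hashing item in the list above can be sketched as a hash ring with virtual nodes. This is a minimal illustration, not Cassandra's implementation: the `HashRing` class, the `vnodes` count, and the use of SHA-256 as the token function are assumptions made for the example.

```python
import bisect
import hashlib


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes        # virtual nodes per physical node
        self._ring = []             # sorted list of (token, node)
        self._tokens = []           # tokens only, parallel to _ring
        for node in nodes:
            self.add(node)

    @staticmethod
    def _token(key):
        # Assumed token function: first 8 bytes of SHA-256 as an integer.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def _rebuild(self):
        self._ring.sort()
        self._tokens = [token for token, _ in self._ring]

    def add(self, node):
        for i in range(self.vnodes):
            self._ring.append((self._token(f"{node}#{i}"), node))
        self._rebuild()

    def remove(self, node):
        self._ring = [(t, n) for t, n in self._ring if n != node]
        self._rebuild()

    def lookup(self, key):
        if not self._ring:
            raise LookupError("ring is empty")
        # First token clockwise from the key's token, wrapping at the end.
        idx = bisect.bisect_right(self._tokens, self._token(key))
        return self._ring[idx % len(self._ring)][1]
```

The property that matters is that removing a node only moves the keys that node owned; keys owned by the remaining nodes keep their placement.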
- Why it is fast
- Sequential read/write vs random read/write
- Memory Mapped File
- Zero copy
- Batch data (compressed)
- Partition: ordered, immutable, replicated
- Consumer group
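The partition and consumer group items above can be illustrated with a toy in-memory log. This is a sketch only and not a real client API: the `Topic` class, its `send` and `poll` methods, and the CRC32-based partitioner are all hypothetical, and replication is left out entirely.

```python
import zlib


class Topic:
    def __init__(self, num_partitions):
        # Each partition is an append-only list; records are never modified.
        self.partitions = [[] for _ in range(num_partitions)]
        # (group, partition) -> next offset that group will read.
        self.offsets = {}

    def send(self, key, value):
        # Assumed partitioner: same key always maps to the same partition,
        # which is what preserves per-key ordering.
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1   # (partition, offset)

    def poll(self, group, partition, max_records=10):
        start = self.offsets.get((group, partition), 0)
        records = self.partitions[partition][start:start + max_records]
        # Each consumer group tracks its own position independently.
        self.offsets[(group, partition)] = start + len(records)
        return records
```

Because offsets are stored per group, two groups can read the same partition at different speeds without affecting each other.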
Database
- Sharding
- Replication
- Master/Slave, Multi-master
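Sharding and replication from the list above are often combined in a small routing layer. The sketch below is an assumption-laden illustration: `ShardRouter`, the shard dictionary layout, and the host names are hypothetical, and "primary"/"replicas" stand in for the master/slave roles named above.

```python
import hashlib
import itertools


class ShardRouter:
    def __init__(self, shards):
        # shards: list of {"primary": name, "replicas": [names]}
        self.shards = shards
        # Round-robin over replicas; fall back to the primary if none exist.
        self._readers = [
            itertools.cycle(s["replicas"] or [s["primary"]]) for s in shards
        ]

    def shard_index(self, key):
        # Hash-based sharding: a stable hash of the key, modulo shard count.
        digest = hashlib.sha256(str(key).encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big") % len(self.shards)

    def for_write(self, key):
        # Writes always go to the shard's primary.
        return self.shards[self.shard_index(key)]["primary"]

    def for_read(self, key):
        # Reads are spread across the shard's replicas.
        return next(self._readers[self.shard_index(key)])


router = ShardRouter([
    {"primary": "db0-primary", "replicas": ["db0-r1", "db0-r2"]},
    {"primary": "db1-primary", "replicas": []},
])
```

Plain modulo sharding forces most keys to move when the shard count changes, which is one reason consistent hashing is often preferred for growing clusters.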
- Twitter/FB timeline
- Pull/Push/Mixed Model
- FB Haystack/Photo storage
- URL shortener
- read heavy
- able to disable write functions
- Design key-value store
- Crawler
- Re-crawling
- Next crawl time: cur+2t if the page is unchanged, cur+t/2 if it changed (t = current crawl interval)
- Design search engine
- In-memory version: Data structure
- Distributed: Solr Cloud internal design
- Design score/rank system for social game
- Search nearby places: GeoHash
- Design Chat app
- Design logging collection and analysis system
- Design shopping cart
- Design Hit Counter
- Design rate limiter
- Design Miao Sha (flash sale)
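For the rate limiter exercise in the list above, one common starting point is a token bucket. This is a minimal single-process sketch, not a prescribed answer: the `TokenBucket` class, its parameters, and the injectable `clock` are assumptions for illustration, and a distributed limiter would need shared state.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock              # injectable so tests need not sleep
        self.tokens = float(capacity)   # start with a full bucket
        self.updated = clock()

    def allow(self, cost=1.0):
        now = self.clock()
        # Refill lazily based on elapsed time, capped at capacity.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False                    # caller should reject or delay
```

The bucket allows short bursts up to `capacity` while holding the long-run average to `rate` requests per second.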
Resources
- Designing Data-Intensive Applications
- Scalability Rules: Principles for Scaling Web Sites
- The Art of Scalability