How to Improve Design Skills
Jeffery Yuan
April 26, 2019
Agenda
- How to Design
- System Design Principles
- Learning from Open Source
- Learning from Existing Products
- System Design Practices
How to Design
- Take time to think about your design
  - Minimize upfront design (YAGNI)
  - That doesn’t mean you skip designing the component
- Related components
- Impact on other components
- What are the alternatives?
- Welcome different approaches and discussion
How to Design
- Estimation
  - Back-of-the-envelope calculation
  - Estimate data size and QPS
- Take time to design the data schema
  - It is difficult to change after deploying to production
- Better user experience
  - Think from the client/user perspective
  - How they use it, what they would like to know
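The back-of-the-envelope estimation mentioned above can be sketched in a few lines. This is a minimal illustration, not a sizing method from the talk: the function name `estimate`, the peak factor, and all input numbers are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def estimate(daily_requests, avg_record_bytes, retention_days, peak_factor=2.0):
    """Return rough (average QPS, peak QPS, total storage in GB)."""
    avg_qps = daily_requests / SECONDS_PER_DAY
    # Assumption: peak traffic is a fixed multiple of the average.
    peak_qps = avg_qps * peak_factor
    # One stored record per request, kept for the whole retention period.
    storage_gb = daily_requests * avg_record_bytes * retention_days / 1e9
    return avg_qps, peak_qps, storage_gb


avg_qps, peak_qps, storage_gb = estimate(
    daily_requests=86_400_000,  # hypothetical: 86.4M writes per day
    avg_record_bytes=500,       # hypothetical average record size
    retention_days=365,
)
print(f"avg QPS ~{avg_qps:,.0f}, peak QPS ~{peak_qps:,.0f}, storage ~{storage_gb:,.0f} GB")
```

Even rough numbers like these help decide early whether a single node, a cache, or sharding is likely to be needed.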
 
Reflection – Lesson Learned
- What mistakes we made
  - Where to store data: DynamoDB or not?
  - The key for the Solr schema
- Why they happened
  - Did not consider near-future requirements
  - Made decisions carelessly
Reflection – Lesson Learned
- Better client library
  - Contains only the libraries and code the client needs
  - Package shared configuration in the library
- Idempotent
- Policy to expire/archive data - Less data
- Optimize data for read
- Read Heavy vs Write Heavy
- Design to Be Disabled - feature toggle
- Isolate Faults - Circuit breaker
- Throttling - Rate limit
- Stateless
- Asynchronous
  - Back pressure with exponential backoff
  - Message queues
- Cache
- Visibility – monitoring
- Separation of concerns
- CAP
- Graceful Degradation
- Be Robust - Hide errors as much as possible
- Be conservative in what you send, be liberal in what you accept
- Make your apps do something reasonable even if not all is right
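Of the principles listed above, fault isolation with a circuit breaker can be sketched as follows. This is a simplified, single-threaded illustration; the class names, the threshold, and the timeout are assumptions made for the example, not taken from any specific library.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, failure_threshold=3, reset_timeout=30.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold  # hypothetical default
        self.reset_timeout = reset_timeout          # seconds, hypothetical default
        self.clock = clock                          # injectable for testing
        self.failures = 0
        self.opened_at = None                       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit is open; failing fast")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.failure_threshold:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

While the circuit is open, callers fail fast instead of piling requests onto a dependency that is already struggling, which pairs naturally with graceful degradation.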
Learning from Open Source
- What makes them popular
- When to use them, when not
- LSM (Log-Structured Merge) trees
- SSTable
- MemTable - in-memory structure, flushed to disk as an SSTable
- How Cassandra (C*) handles deletes: tombstones (grace period)
- Merkle trees
- Bloom Filter
- Index
- CommitLog
- Serialize cache data (row cache, key cache) to avoid a cold restart
- Session Coordinator
- Gossip protocol
- Seed nodes
- Consistent Hashing
- Eventual Consistency
- Local Index (vs Global Index)
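Consistent hashing, one of the ideas listed above, can be illustrated with a small hash ring. This is a sketch under stated assumptions: the `HashRing` class, the virtual-node count, and the use of SHA-256 are choices made for the example and are not necessarily what any particular database uses.

```python
import bisect
import hashlib


def _hash(key: str) -> int:
    # SHA-256 is an arbitrary choice for illustration.
    return int(hashlib.sha256(key.encode("utf-8")).hexdigest(), 16)


class HashRing:
    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes  # virtual nodes per physical node (hypothetical value)
        self._ring = []       # sorted list of (token, node)
        for node in nodes:
            self.add(node)

    def add(self, node):
        for i in range(self.vnodes):
            bisect.insort(self._ring, (_hash(f"{node}#{i}"), node))

    def remove(self, node):
        self._ring = [(token, n) for token, n in self._ring if n != node]

    def get(self, key):
        """Return the node owning the first token at or after the key's hash."""
        if not self._ring:
            raise LookupError("ring is empty")
        idx = bisect.bisect_right(self._ring, (_hash(key), ""))
        return self._ring[idx % len(self._ring)][1]  # wrap around the ring


ring = HashRing(["node-a", "node-b", "node-c"])
owner = ring.get("user:42")
```

The property that matters is that removing a node only moves the keys that node owned; every other key keeps its owner.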
- Why it is fast
- Sequential read/write vs. random read/write
- Memory-mapped files
- Zero copy
- Batched data (compressed)
- Partition: ordered, immutable, replicated
- Consumer group
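The partition and consumer-group bullets above can be made concrete with a toy in-memory log. The slides do not name the system being described, so treating it as a partitioned commit log is an assumption, and every name here (`PartitionedLog`, `assign_partitions`) is hypothetical rather than a real client API.

```python
import zlib
from collections import defaultdict


class PartitionedLog:
    """Toy append-only log: records are ordered within a partition and never modified."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        """Append to the partition chosen by key hash; return (partition, offset)."""
        p = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[p].append((key, value))
        return p, len(self.partitions[p]) - 1

    def read(self, partition, offset, max_records=10):
        """Sequential read of up to max_records starting at offset."""
        return self.partitions[partition][offset:offset + max_records]


def assign_partitions(num_partitions, consumers):
    """Spread partitions round-robin over the consumers of one group."""
    members = sorted(consumers)
    assignment = defaultdict(list)
    for p in range(num_partitions):
        assignment[members[p % len(members)]].append(p)
    return dict(assignment)


log = PartitionedLog(4)
p1, o1 = log.append("user-1", "login")
p2, o2 = log.append("user-1", "logout")
groups = assign_partitions(4, ["c2", "c1"])
```

Records with the same key land in the same partition, so their relative order is preserved, and each partition is read by exactly one consumer in the group.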
Database
- Sharding
- Replication
- Master/Slave, Multi-master
- Twitter/FB timeline
- Pull/Push/Mixed Model
- FB Haystack/Photo storage
- URL shortener
  - Read heavy
  - Able to disable write functions
 
- Design key-value store
- Crawler
  - Re-crawling: next crawl at cur + t/2 if the page changed, or cur + 2t if it did not
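The re-crawl rule above can be sketched as a small function. It assumes the reading that a changed page is revisited sooner (interval halved) and an unchanged page later (interval doubled); the function name and the minimum and maximum bounds are hypothetical additions.

```python
def next_crawl(now, interval, changed, min_interval=60.0, max_interval=30 * 86400.0):
    """Return (next crawl time, new interval), both in seconds."""
    # Halve the interval for pages that changed, double it for pages that did not.
    interval = interval / 2 if changed else interval * 2
    # Clamp so hot pages are not hammered and cold pages are not forgotten.
    interval = max(min_interval, min(max_interval, interval))
    return now + interval, interval


changed_result = next_crawl(now=1000.0, interval=3600.0, changed=True)
unchanged_result = next_crawl(now=1000.0, interval=3600.0, changed=False)
```

Over repeated crawls the interval converges toward how often each page actually changes.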
 
- Design search engine
  - In-memory version: data structure
  - Distributed: SolrCloud internal design
 
- Design score/rank system for social game
- Search nearby places: GeoHash
- Design Chat app
- Design logging collection and analysis system
- Design shopping cart
- Design Hit Counter
- Design rate limiter
- Design a flash sale (Miao Sha) system
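As one worked instance of the practice topics above, a rate limiter is often built on a token bucket. The sketch below is single-process and not thread-safe; the class name and parameters are assumptions for illustration, and a distributed design would need shared state.

```python
import time


class TokenBucket:
    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.clock = clock        # injectable for testing
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1.0):
        """Return True and consume tokens if the request may proceed."""
        now = self.clock()
        # Refill according to elapsed time, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

A bucket with `rate=1` and `capacity=2` lets two requests through immediately, then roughly one per second after that.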
Resources
- Designing Data-Intensive Applications
- Scalability Rules: Principles for Scaling Web Sites
- The Art of Scalability