📖 A Short Story
An AI Engineer (industry) and an AI Researcher (academia) need different math. In industry you need to know how to implement something; in research you have to prove why it works. This chapter covers a math roadmap for both paths, the interview focus areas, and a continuous learning strategy.
AI Engineer Path (Industry)
Goal: build, optimize, and deploy production models while balancing correctness and efficiency.
Core Math (Must-Know)
- Linear Algebra: matrix ops, broadcasting, shape reasoning (used daily in debugging).
- Calculus: gradient intuition, the chain rule, and the ability to trace backprop.
- Probability: distributions, expectation, Bayesian thinking (uncertainty quantification).
- Optimization: Adam, learning rate, batch size, convergence diagnosis.
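To make "trace backprop" concrete, here is a minimal sketch of a numerical gradient check. It assumes a linear model with mean squared error; the function names `loss_and_grad` and `numerical_grad` are illustrative choices, not something the chapter defines. The analytic gradient comes from the chain rule, and central finite differences confirm it.

```python
import numpy as np

def loss_and_grad(W, X, y):
    """Mean squared error of a linear model and its analytic gradient w.r.t. W."""
    N = X.shape[0]
    residual = X @ W - y                 # shape (N,)
    loss = 0.5 * np.mean(residual ** 2)
    grad = X.T @ residual / N            # chain rule: dL/dW = X^T (XW - y) / N, shape (D,)
    return loss, grad

def numerical_grad(W, X, y, eps=1e-6):
    """Central finite differences, one coordinate of W at a time."""
    grad = np.zeros_like(W)
    for i in range(W.size):
        step = np.zeros_like(W)
        step[i] = eps
        loss_plus, _ = loss_and_grad(W + step, X, y)
        loss_minus, _ = loss_and_grad(W - step, X, y)
        grad[i] = (loss_plus - loss_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))   # N = 8 samples, D = 3 features
y = rng.normal(size=8)
W = rng.normal(size=3)

_, analytic = loss_and_grad(W, X, y)
numeric = numerical_grad(W, X, y)
max_abs_diff = float(np.max(np.abs(analytic - numeric)))
print("analytic grad shape:", analytic.shape)
print("max abs difference:", max_abs_diff)
```

If the two gradients disagree by more than rounding error, the usual suspects are a missing transpose or a wrong normalization by `N`, which is exactly the kind of shape reasoning the list above refers to.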
Nice-to-Have (Deeper Impact)
- Information Theory: cross-entropy, KL divergence (loss design, distillation).
- Numerical Analysis: stability, mixed precision, gradient scaling (production debugging).
- Graph Theory: GNN applications (recommendation, fraud detection).
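The link between cross-entropy and KL divergence mentioned above can be checked numerically. The sketch below assumes discrete distributions with strictly positive probabilities; the helper names are illustrative. It verifies the identity H(p, q) = H(p) + KL(p || q), which is why minimizing cross-entropy against a fixed target (for example a teacher in distillation) is equivalent to minimizing the KL divergence to it.

```python
import numpy as np

def entropy(p):
    """Shannon entropy H(p) in nats; assumes all entries of p are > 0."""
    return float(-np.sum(p * np.log(p)))

def cross_entropy(p, q):
    """Cross-entropy H(p, q) in nats; assumes all entries of q are > 0."""
    return float(-np.sum(p * np.log(q)))

def kl_divergence(p, q):
    """KL(p || q) in nats; assumes all entries of p and q are > 0."""
    return float(np.sum(p * np.log(p / q)))

p = np.array([0.7, 0.2, 0.1])   # target distribution
q = np.array([0.5, 0.3, 0.2])   # model distribution

h_p = entropy(p)
h_pq = cross_entropy(p, q)
kl_pq = kl_divergence(p, q)

print("H(p, q)          :", h_pq)
print("H(p) + KL(p || q):", h_p + kl_pq)
```

Because H(p) does not depend on q, the gradient of the cross-entropy with respect to the model's parameters equals the gradient of the KL term.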
AI Researcher Path (Academia/Lab)
Goal: invent new methods, provide theoretical guarantees, and publish.
Core Math (Must-Know Deeply)
- Matrix Calculus: vector-Jacobian products, trace identities, Hessian analysis (CH 48).
- Probability & Measure Theory: convergence, PAC bounds, concentration inequalities.
- Optimization Theory: convex analysis, convergence rates, second-order methods (CH 49).
- Statistical Learning Theory: Rademacher complexity, VC dimension, generalization bounds.
Specialized Depth
- Variational Inference: ELBO, reparameterization, normalizing flows (generative models).
- Differential Geometry / Topology: manifold learning, latent space structure.
- Stochastic Processes: SDEs (diffusion models), Markov chain mixing time.
Interview Math Focus by Role
MLE / AI Engineer (Applied)
- Shape debugging: given $X \in \mathbb{R}^{N \times D}$, what are the dimensions of the batch norm statistics?
- Complexity analysis: self-attention $O(n^2 d)$ vs linear attention $O(n d^2)$. When is each one better?
- Gradient behavior: how would you diagnose vanishing gradients? (For example, with a gradient norm plot.)
- Loss landscape: the training loss is oscillating. Would you raise or lower the LR?
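The first two questions above can be explored with a few lines of NumPy. This is a rough sketch under stated assumptions: batch norm is taken in its plain per-feature form (no learned scale or shift, no running statistics), and `attention_costs` is a hypothetical helper that counts only leading-order multiplications, ignoring constants, memory, and any difference in model quality.

```python
import numpy as np

# Shape debugging: batch norm over X in R^{N x D} keeps one statistic per feature.
N, D = 32, 16
X = np.random.default_rng(0).normal(size=(N, D))
mean = X.mean(axis=0)                       # shape (D,)
var = X.var(axis=0)                         # shape (D,)
X_hat = (X - mean) / np.sqrt(var + 1e-5)    # (N, D) broadcast against (D,)
print("mean:", mean.shape, "var:", var.shape, "X_hat:", X_hat.shape)

def attention_costs(n, d):
    """Leading-order multiply counts for sequence length n and head dimension d."""
    return {"softmax": n * n * d, "linear": n * d * d}

for n in (32, 64, 4096):
    costs = attention_costs(n, d=64)
    cheaper = min(costs, key=costs.get)
    print(f"n={n:5d}, d=64 -> {costs}, cheaper: {cheaper}")
```

Under this crude count the two costs are equal at n = d, so linear attention only pays off once the sequence length clearly exceeds the head dimension.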
Research Scientist / PhD
- Prove that SGD converges for convex Lipschitz functions. What is the rate?
- Expressive power of the attention mechanism: how would you prove it?
- Generalization bound: how do model complexity and dataset size relate?
- Novel architecture design: what is the inductive bias, and why should it work theoretically?
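For the first question, the standard answer is a rate of $O(1/\sqrt{T})$. The sketch below outlines the usual argument under assumptions that are not stated in the question itself: $f$ is convex with minimizer $x^*$, the stochastic gradients $g_t$ are unbiased with $\mathbb{E}\|g_t\|^2 \le G^2$, the start satisfies $\|x_1 - x^*\| \le R$, and the step size $\eta$ is constant.

```latex
\begin{align*}
\|x_{t+1} - x^*\|^2
  &= \|x_t - x^*\|^2 - 2\eta \langle g_t, x_t - x^* \rangle + \eta^2 \|g_t\|^2, \\
\mathbb{E}\,[f(x_t) - f(x^*)]
  &\le \mathbb{E}\,\langle \nabla f(x_t), x_t - x^* \rangle
   \le \frac{\mathbb{E}\|x_t - x^*\|^2 - \mathbb{E}\|x_{t+1} - x^*\|^2}{2\eta}
      + \frac{\eta G^2}{2}, \\
\mathbb{E}\,[f(\bar{x}_T)] - f(x^*)
  &\le \frac{R^2}{2\eta T} + \frac{\eta G^2}{2}
   = \frac{R G}{\sqrt{T}}
   \quad \text{for } \eta = \frac{R}{G\sqrt{T}},
   \qquad \bar{x}_T = \frac{1}{T}\sum_{t=1}^{T} x_t .
\end{align*}
```

The second line uses convexity and unbiasedness of $g_t$; the third sums over $t = 1, \dots, T$, telescopes the distance terms, and applies Jensen's inequality to the averaged iterate.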
Structured Learning Plan
- Month 1-2: Phase 0-2 (Foundation + Linear Algebra + Calculus). Build a solid base.
- Month 3-4: Phase 3-4 (Probability + Optimization). Understand ML theory.
- Month 5-6: Phase 5 (Deep Learning Math). Understand the math behind the architectures.
- Month 7-8: Phase 6-7 (Advanced). Start reading papers and trace their citations.
- Ongoing: Phase 8-9 (Applied + Career). Apply the math to real systems and practice for interviews.
Recommended Resources
- Books: "Mathematics for Machine Learning" (Deisenroth et al.): free, comprehensive.
- Books: "Deep Learning" (Goodfellow et al.): Ch 2-5 = math foundation, Ch 6-12 = DL math.
- Books: "Probabilistic Machine Learning" (Murphy): Bayesian, PGM, inference.
- Courses: Stanford CS229 (ML theory), CS231n (CNNs), CS224n (NLP), CS234 (RL).
- YouTube: 3Blue1Brown (Linear Algebra, Calculus), Mathematical Monk (Probability, ML).
- Blogs: Distill.pub (interactive explanations), Lilian Weng (technical deep dives).
Python: Skill Assessment Checklist
```python
# Self-assessment: can you derive/implement these from scratch?
checklist = {
    "Linear Algebra": [
        "Matrix multiplication (no library)",
        "Eigenvalue decomposition",
        "SVD for dimensionality reduction",
        "Batch matrix ops shape reasoning",
    ],
    "Calculus": [
        "Chain rule for neural network",
        "Gradient of matrix loss w.r.t. weights",
        "Hessian-vector product",
        "Numerical gradient check",
    ],
    "Probability": [
        "MLE for Gaussian parameters",
        "Bayesian posterior update",
        "KL divergence between two Gaussians",
        "Monte Carlo integration",
    ],
    "Optimization": [
        "SGD with momentum",
        "Adam update rule",
        "Learning rate schedule (cosine)",
        "Convergence rate analysis (simple case)",
    ],
    "Deep Learning": [
        "Manual backprop for MLP",
        "CNN forward/backward pass",
        "Attention mechanism from scratch",
        "Loss function design with math",
    ],
}

# Print each category followed by an unticked checkbox per skill.
for category, items in checklist.items():
    print(f"\n{category}:")
    for item in items:
        print(f"  [ ] {item}")
```
Practice Tasks
- List the math requirements for your current role: which chapters of which phases are most relevant?
- Read five job descriptions from your target companies. Which math skills do they have in common?
- Attempt one CS229 problem set, covering both the theoretical proofs and the programming.
- Watch 3Blue1Brown's Linear Algebra series to refresh your geometric intuition.
Interview Questions
- Industry vs research role: how should your math depth differ?
- You are asked to read a new architecture paper and implement it within one week. What is your approach?
- How would you identify a weak area in your math skills and improve it as quickly as possible?
- Production bug: the model diverges. What are your math-based debugging steps?
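One math-based starting point for the divergence question is the step-size stability condition. The toy sketch below assumes plain gradient descent on the one-dimensional quadratic f(x) = 0.5 * c * x^2 with curvature c; the function name `run_gd` and the specific values are illustrative. Each update multiplies x by (1 - lr * c), so the iterates shrink only when lr < 2 / c.

```python
def run_gd(lr, curvature=10.0, x0=1.0, steps=50):
    """Gradient descent on f(x) = 0.5 * curvature * x**2; returns |x| after each step."""
    x = x0
    trajectory = [abs(x)]
    for _ in range(steps):
        grad = curvature * x      # f'(x) = curvature * x
        x = x - lr * grad         # x is multiplied by (1 - lr * curvature)
        trajectory.append(abs(x))
    return trajectory

stable = run_gd(lr=0.15)      # |1 - 0.15 * 10| = 0.5 < 1, so |x| shrinks
unstable = run_gd(lr=0.25)    # |1 - 0.25 * 10| = 1.5 > 1, so |x| grows

print("final |x| with lr=0.15:", stable[-1])
print("final |x| with lr=0.25:", unstable[-1])
```

Real networks do not have constant curvature, so this is only a mental model: when the loss oscillates or blows up, lowering the learning rate (or checking gradient norms for spikes) is a reasonable first step before looking at data or numerical-precision issues.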
Summary
- AI Engineer = implement + debug + optimize: depth in Linear Algebra, Calculus, Probability, and Optimization.
- AI Researcher = invent + prove + publish: Matrix Calculus, Measure Theory, Learning Theory, convergence analysis.
- Interview: applied = shape/complexity/gradient diagnosis; research = proofs/bounds/novelty justification.
- Structured plan: 8 months of base building + ongoing paper reading + application to real systems.
- Books + courses + blogs + self-assessment = continuous mathematical growth.