Summer is finally coming to an end. The days are getting a little less warm now. To be honest, I despise the heat, and it gets extremely hot here in New Delhi, India. By the way, I redesigned my website. Now there is search. There is also a graph, so I can write connected notes and essays. There are a few bugs here and there, but I guess I will figure them out eventually.
I have configured it on Cloudflare Pages this time. Last time, it was hosted on GitHub Pages, so I decided to try something different. The search has been significantly improved, and it will be helpful as I write more.
- We have a new term coined in the field of ML/AI: “context engineering” (CE). I was listening to this podcast on CE.
- DeepSeek used GRPO in the R1 model that they released a few months back. I was reading about why GRPO works.
- OpenAI finally released an open source model: finetuning with gpt-oss
- Some thoughts on the gpt-oss model:
  - sliding window attention (ref: https://arxiv.org/abs/1901.02860)
  - mixture of experts (ref: https://arxiv.org/abs/2101.03961)
  - RoPE w/ YaRN (ref: https://arxiv.org/abs/2309.00071)
  - attention sinks (ref: StreamingLLM, https://arxiv.org/abs/2309.17453)
- Folks at DeepMind released this amazing book with several chapters on scaling LLMs.
- Found this amazing playlist on backend systems created using first principles.
- A good read on system design
- Some time back, I read about the CPU in detail.
- This is some really cool HPC stuff on the internet. If you are into this, then Algorithms for Modern Hardware is a must.