Any multimodal LLM can read your notes and cleanly interpret and describe diagrams you've drawn on them.
Qwen3 VL is my current favorite open-weight multimodal model.
That said, I'm quite certain the SOTA models (ChatGPT, Gemini, et al.) can all do this as well.
Not only do these models outperform traditional OCR at reading, but they can also describe other details, like layout, in whatever format you like, including re-creating the image they're looking at as HTML or the like.
I will, as an example, ask Qwen3 VL (32B, FP16; the dense model, not the sparse one) what it sees while holding open a recent hotel access sleeve and card from a conference in Las Vegas.
Its response:
As you can see, it can read just fine. And this is an open-weights model, running entirely locally on my laptop. And it's old as hell (at least in LLM terms).
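If you want to try this yourself, here's a minimal sketch of the kind of query involved, assuming you're serving the model behind an OpenAI-compatible endpoint (llama.cpp's server, vLLM, LM Studio, whatever you like). The base URL, model name, and image filename below are placeholders, not anything from my actual setup.

```python
# Sketch: send a photo to a locally served multimodal model through an
# OpenAI-compatible chat completions endpoint. The port, model name, and
# filename are placeholders -- use whatever your local server exposes.
import base64
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

# Read the photo and embed it as a base64 data URI in the message.
with open("hotel_sleeve.jpg", "rb") as f:
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="qwen3-vl-32b",  # placeholder: the name your server reports
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "What do you see here? Read any handwriting you find.",
                },
                {
                    "type": "image_url",
                    "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```

The same pattern works against the paid APIs too; you'd just swap the base URL and model name.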
I'm unsure why it's my job to tell you what you could have tried, with the knowledge you already have (that, uhhh, ChatGPT... exists), to see this for yourself.
I'll grant you that I don't use the paid models very often (and certainly not for something local models can do fine), and that free-tier distilled SOTA models might even suck at this.
Paid models are going to be better than the LLMs you can run locally, though, and they can read notes with ease (even notes I don't find remotely legible, like the room number on that sleeve, which can only be described as chickenscratch).