From what you’re describing, it sounds like you’re trying to combine a few things that are tricky but doable:
Visual-based learning: You mentioned you process charts visually. That’s basically a computer vision problem. There’s a ton of research on using convolutional neural networks (CNNs) to “read” charts and patterns. You’d probably want to capture the chart frames and then feed them into a model trained to predict your next action based on your historical trades.
Front-end automation: You already cracked the mimic-your-clicks problem, which is huge. Once the AI is trained, the main challenge is hooking it up so that when a client triggers a token, it executes cleanly and safely. That’s mostly software engineering — making sure there’s proper sandboxing, logging, and error handling.
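To make the "cleanly and safely" part concrete, here is a minimal sketch of a guard you could put between the model and your click automation. Everything in it is an assumption for illustration: `Decision`, `execute_safely`, the `click_fn` callback (standing in for whatever your existing automation exposes), the `min_confidence` threshold, and the `dry_run` flag are hypothetical names, not part of any library.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logger = logging.getLogger("trade_executor")


@dataclass
class Decision:
    action: str        # hypothetical labels, e.g. "buy", "sell", "hold"
    confidence: float  # model confidence, assumed to be in [0, 1]


def execute_safely(
    decision: Decision,
    click_fn: Callable[[str], None],  # assumed hook into your click automation
    min_confidence: float = 0.8,      # assumed threshold, tune for your setup
    dry_run: bool = True,
) -> bool:
    """Return True only if click_fn was actually called and did not raise."""
    if decision.action == "hold":
        logger.info("hold: nothing to execute")
        return False
    if decision.confidence < min_confidence:
        logger.info("skipped %s: confidence %.2f below %.2f",
                    decision.action, decision.confidence, min_confidence)
        return False
    if dry_run:
        # Log what would have happened without touching the front end.
        logger.info("dry run: would execute %s", decision.action)
        return False
    try:
        click_fn(decision.action)
    except Exception:
        # Record the full traceback and refuse to retry automatically.
        logger.exception("execution failed for %s", decision.action)
        return False
    logger.info("executed %s", decision.action)
    return True
```

Defaulting `dry_run` to `True` means a misconfigured deployment logs instead of trading, which is probably the failure mode you would rather have.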
Training vs inference: You don’t necessarily need to train massive models from scratch if your dataset is limited (sounds like it is). You can start with a smaller, pre-trained model and fine-tune it on your own trades. Jupyter notebooks are fine for prototyping, but for production you’ll want something more robust: the trained PyTorch or TensorFlow model running as a service on a lightweight server, possibly in the cloud if you need scalability. AWS, GCP, and Azure all have GPU instances for this, but you don’t have to use them if you have decent local hardware.
Data collection: The pixel-distance approach is interesting — basically turning your chart into features the AI can learn from. Just make sure your feature extraction is consistent (same chart sizes, timeframes, indicators, etc.), otherwise the model will struggle to generalize.
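One way to get that consistency, sketched under the assumption that your pixel-distance features are offsets between points you pick off the chart: divide every offset by the chart's width and height so the same pattern produces the same numbers at any resolution. `normalized_distances` and the `Point` alias are hypothetical names for illustration, and your actual feature definition may differ.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) in pixels, as captured from the chart


def normalized_distances(
    points: List[Point], width: int, height: int
) -> List[Tuple[float, float]]:
    """Offsets between consecutive points as fractions of the chart size."""
    if width <= 0 or height <= 0:
        raise ValueError("chart width and height must be positive")
    features = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        # Dividing by the chart dimensions removes the dependence on resolution.
        features.append(((x1 - x0) / width, (y1 - y0) / height))
    return features


# The same shape captured at two resolutions yields identical features.
small = normalized_distances([(0, 0), (80, 40), (160, 20)], width=800, height=400)
large = normalized_distances([(0, 0), (160, 80), (320, 40)], width=1600, height=800)
```

This only handles chart size. Timeframes and indicator settings still have to be kept fixed (or recorded as extra features) when you capture the data.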
Honestly, this is ambitious, but very possible if you take it step by step: capture the chart images → label them with your actions → train a CV model → integrate with automation. You might want to start small with a single instrument or timeframe first.
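For the labeling step, a rough sketch of one possible approach: match each captured frame to the first action you took within a short window after it, and call everything else "hold". The function name `label_frames`, the 5-second `window`, and the `"hold"` default are assumptions for illustration; pick whatever matches how quickly you actually act on a chart.

```python
from bisect import bisect_left
from typing import List, Tuple


def label_frames(
    frame_times: List[float],
    actions: List[Tuple[float, str]],  # (timestamp, action) from your trade log
    window: float = 5.0,               # assumed reaction window in seconds
    default: str = "hold",
) -> List[str]:
    """Label each frame with the first action taken within `window` seconds after it."""
    ordered = sorted(actions)
    action_times = [t for t, _ in ordered]
    labels = []
    for frame_time in frame_times:
        # Index of the first action at or after this frame.
        i = bisect_left(action_times, frame_time)
        if i < len(ordered) and action_times[i] - frame_time <= window:
            labels.append(ordered[i][1])
        else:
            labels.append(default)
    return labels


labels = label_frames(
    frame_times=[0.0, 10.0, 20.0, 30.0],
    actions=[(12.0, "buy"), (31.0, "sell")],
)
```

Expect most frames to end up labeled "hold", so you will likely need to rebalance the classes before training.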