
That's what I did for a recent side project. It worked really well, although coordinating the different services was a bit of a pain. We ended up using a cron job to run everything repeatedly, which was inefficient but worked. If we'd had the time, we would have used a proper RPC framework or some sort of message queue.
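A setup like that might look something like the crontab below — a sketch, not the commenter's actual config; the script paths and the five-minute interval are assumptions:

```
# Run each service every 5 minutes; no coordination between them,
# so the classifier may fire even when the scraper found nothing new.
*/5 * * * * /usr/bin/python3 /opt/pipeline/run_scraper.py
*/5 * * * * /usr/bin/python3 /opt/pipeline/run_classifier.py
```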


What do you mean by 'run everything repeatedly'?


For instance, say I had two services: a scraper, and an ML algorithm that classified the scraped data. You only need to run the classifier when there's new data, but I didn't have anything set up to tell the classifier "hey! There's new data to classify!" Instead, I just ran it every 5 minutes, which was sufficient latency for the project.
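The polling approach described above can be sketched roughly like this — a hypothetical version, assuming the scraper assigns incrementing ids and the classifier remembers the last id it processed in a small state file (all names here are made up for illustration):

```python
import json
import pathlib

# Hypothetical state file tracking the last item the classifier has seen.
STATE = pathlib.Path("classifier_state.json")

def run_once(items, classify):
    """One cron-triggered pass: classify only items newer than last run.

    `items` is the scraper's output (dicts with an "id" key);
    `classify` is whatever model inference function you have.
    """
    last_id = json.loads(STATE.read_text())["last_id"] if STATE.exists() else -1
    fresh = [it for it in items if it["id"] > last_id]
    results = [classify(it) for it in fresh]
    if fresh:
        # Persist progress so the next 5-minute run skips old data.
        STATE.write_text(json.dumps({"last_id": max(it["id"] for it in fresh)}))
    return results
```

On a run with no new data, `fresh` is empty and the pass is a cheap no-op — which is exactly the inefficiency of polling: most invocations do nothing, but the design needs no message broker at all.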



