I am building an image archive that will ultimately hold millions of searchable images. I'd love to hear your thoughts on which technologies to use.

My list so far:
- Storage - S3
- CDN - CloudFront or CloudFlare
- Database - ?
- Front-End Framework - ?


I have worked on projects that transferred large image files across multiple vendors and served them on the web for different targets (mobile, desktop, etc.). I have also implemented cloud and DevOps solutions for high volumes of data and transactions.

I could give you a list of technologies, but the problem here is more one of design and architecture, so I will start with that. Many of the decisions also depend on the use case, so I will need more information about certain aspects of your application.

1) To search for an image, users will enter search terms, so you will have to associate search terms/tags with actual images. There are multiple ways to do this association. What is the link between a search term and an image in your case? Do you have a separate database/service that links given search terms to a given set of images?

2) Since you are talking about millions of images, they will need to be partitioned logically (bucketed) so they can be stored and searched efficiently. Is there a natural distribution in your data? Do users all over the world access all of the images?

3) For any data stored alongside the images, we need to go one level of detail further. A relational DB might serve a lot of needs, but there may be use cases where other kinds of storage are worth considering. Some questions that will lead to answers: What kind of data will be stored per image? How frequently might this data change? How many new images will be added to the archive? How quickly do you want updates to be visible to users/consumers?

4) Roughly what kind of traffic do you expect? This will drive a lot of the design around auto scaling, load balancing and other aspects.

5) I am not a front-end expert, so I cannot say which technologies would fit your needs best, but similar principles of scaling apply to the front-end stack.
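To make points 1 and 2 a bit more concrete, here is a minimal sketch in Python. It is only an illustration under assumptions, not a recommendation for your actual stack: `TagIndex`, `storage_key`, the `.jpg` suffix and the 256-way bucketing are all hypothetical names and choices I made up for the example. At your scale a real system would keep the tag-to-image mapping in a database or search service rather than in memory.

```python
import hashlib
from collections import defaultdict


class TagIndex:
    """In-memory inverted index: tag -> set of image IDs (illustrative only)."""

    def __init__(self):
        self._images_by_tag = defaultdict(set)

    def add(self, image_id, tags):
        # Normalize tags so "Sunset" and "sunset " map to the same entry.
        for tag in tags:
            self._images_by_tag[tag.strip().lower()].add(image_id)

    def search(self, *terms):
        """Return the image IDs that carry every search term (AND semantics)."""
        sets = [self._images_by_tag.get(t.strip().lower(), set()) for t in terms]
        if not sets:
            return set()
        return set.intersection(*sets)


def storage_key(image_id, num_prefixes=256):
    """Derive a bucketed object key from a hash of the image ID.

    Hashing spreads images evenly across a fixed number of logical
    buckets, which is one possible partitioning scheme when the data
    has no natural distribution (assumption for this sketch).
    """
    digest = hashlib.sha256(image_id.encode("utf-8")).hexdigest()
    bucket = int(digest[:8], 16) % num_prefixes
    return f"{bucket:02x}/{image_id}.jpg"


index = TagIndex()
index.add("img-1", ["Sunset", "beach"])
index.add("img-2", ["sunset", "mountain"])
matches = index.search("sunset", "beach")  # only images tagged with both terms
key = storage_key("img-1")                 # same ID always maps to the same bucket
```

If your images do have a natural distribution (by region, customer, or date), you would probably bucket on that attribute instead of a hash, which is why the answers to the questions above matter.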

Answered 6 years ago

