
What programming languages would you use for this example web development project?



    Technically speaking, what is the best way of doing the project below? What languages and technologies would you use, and why, please?

    Summary as follows:
    • Scrape a lot of data (text, pictures, videos) from multiple websites multiple times a day (for example, all vehicle data from autotrader.co.uk)
    • Combine all the data and check for any changes in the data being scraped (e.g. new pictures)
    • Make the data available on a website in real time (database?)
    • Make the data searchable with multiple filters (think autotrader.co.uk again)



    Many thanks

    #2
    Not even a 'please'?

    Edit - Oh sorry, there was a please at the end.

    I'm sure someone will be along in a minute who's keen to prep you on your interview...
    "I can put any old tat in my sig, put quotes around it and attribute to someone of whom I've heard, to make it sound true."
    - Voltaire/Benjamin Franklin/Anne Frank...



      #3
      I'll give you a few thoughts for free

      If you want to compare images to see if they are different (for example if the front of the car is always called front.jpg and the dealer replaces the old one with a new one) you can get a hash of the image and compare that. You could exploit this idea to compare text or something as well.
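A minimal sketch of that hash comparison in Python, assuming you keep the previous hash for each file somewhere (the function names `content_hash` and `has_changed` are made up for illustration). It hashes raw bytes, so it works for images and for text alike:

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return a SHA-256 hex digest of the raw bytes (image, text, anything)."""
    return hashlib.sha256(data).hexdigest()


def has_changed(previous_hash: str, data: bytes) -> bool:
    """True if the freshly scraped bytes differ from what was hashed last time."""
    return content_hash(data) != previous_hash


# Hypothetical usage: front.jpg scraped on two different runs.
old_bytes = b"bytes of the old front.jpg"
new_bytes = b"bytes of the new front.jpg"
stored = content_hash(old_bytes)

print(has_changed(stored, old_bytes))  # same bytes, so no change
print(has_changed(stored, new_bytes))  # different bytes, so changed

# Text works the same way once it is encoded to bytes.
text_hash = content_hash("2015 Ford Focus, 1.6 litre".encode("utf-8"))
```

Bear in mind this is a byte-for-byte check: the same picture re-saved or re-compressed by the site will show up as changed.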

      I'd also look into using existing software to turn HTML documents into JSON and store that in the database. That way, data from autotrader can be kept in the same table as data from Mumsnet. Databases like Cosmos DB or Snowflake can query the JSON documents, so you could have the same search form compare engine sizes from autotrader or diaper sizes on Mumsnet.
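To give a rough idea of the "one table, many shapes of JSON" approach, here is a sketch using SQLite from the Python standard library as a stand-in for a document database. This is a swap-in for illustration, not Cosmos DB or Snowflake, and it assumes your SQLite build includes the JSON functions (most recent ones do). The table name, field names and helper functions are all hypothetical:

```python
import json
import sqlite3


def make_store() -> sqlite3.Connection:
    """Create an in-memory table holding one JSON document per scraped item."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (source TEXT, doc TEXT)")
    return conn


def add_item(conn: sqlite3.Connection, source: str, doc: dict) -> None:
    """Store a scraped item as JSON text, whatever fields it happens to have."""
    conn.execute(
        "INSERT INTO items (source, doc) VALUES (?, ?)",
        (source, json.dumps(doc)),
    )


def search_max(conn: sqlite3.Connection, field: str, limit: float) -> list:
    """Return items whose top-level `field` is <= limit.

    Documents without that field are skipped, because json_extract
    yields NULL for them. Assumes simple field names (letters, underscores).
    """
    rows = conn.execute(
        "SELECT source, doc FROM items "
        "WHERE json_extract(doc, ?) <= ? ORDER BY source",
        ("$." + field, limit),
    )
    return [(source, json.loads(doc)) for source, doc in rows]


conn = make_store()
add_item(conn, "autotrader", {"make": "Ford", "engine_size": 1.6})
add_item(conn, "autotrader", {"make": "BMW", "engine_size": 3.0})
add_item(conn, "mumsnet", {"product": "nappies", "size": 4})

small_engines = search_max(conn, "engine_size", 2.0)
small_nappies = search_max(conn, "size", 5)
```

The same `search_max` call serves both sites; only the field name changes, which is the point of keeping everything as JSON in one place.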



        #4
        Just wondering why you would want to build a meta-search engine for a site that already has a search engine...

        Anyway, we wrote something similar many years ago to combine a host of CMDBs to create an overall picture of the estate. We used C++ frontending the various DBMS tools already in place.

        And no, I don't have the code...
        Blog? What blog...?



          #5
          VB6



            #6
            dBase III



              #7
              Lisp.
              …Maybe we ain’t that young anymore



                #8
                Clojure
                Down with racism. Long live miscegenation!



                  #9
                  The correct answer is Python, because HTML parsing comes out of the box. Yes, it needs to go into a database, as it should be searchable. For images you can compare the size or even the name.
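For what it's worth, "out of the box" presumably means the standard library's `html.parser` module. A minimal sketch, assuming all you want from a page is its image URLs and visible text (the `ListingParser` class, `parse_listing` function and the sample markup are invented for illustration):

```python
from html.parser import HTMLParser


class ListingParser(HTMLParser):
    """Collect <img> sources and non-empty text chunks from an HTML page."""

    def __init__(self):
        super().__init__()
        self.images = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag.
        if tag == "img":
            src = dict(attrs).get("src")
            if src:
                self.images.append(src)

    def handle_data(self, data):
        stripped = data.strip()
        if stripped:
            self.text.append(stripped)


def parse_listing(html: str) -> dict:
    """Parse one page and return its image URLs and text fragments."""
    parser = ListingParser()
    parser.feed(html)
    parser.close()
    return {"images": parser.images, "text": parser.text}


sample = '<div><h1>Ford Focus</h1><img src="front.jpg"><p>1.6 litre</p></div>'
listing = parse_listing(sample)
print(listing)
```

This naive version also picks up the contents of script and style tags as text, so a real scraper would need to filter those out or use a more capable third-party parser.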

                  Sounds like a website that shows all the results of all the other websites doing the same thing. Copyright might be a problem there unless it's hosted in some far-flung lawless land.
                  I'm alright Jack



                    #10
                    In the helpful spirit shown above: FORTRAN IV.
                    When the fun stops, STOP.
