
SKA news


    So I still don't understand - does ATW have a credible business model or not?
    Hard Brexit now!
    #prayfornodeal



      Originally posted by sasguru
      So I still don't understand - does ATW have a credible business model or not?
      He thinks a half is 10% and twitter does not work.

      I am going to say no.



        Originally posted by minestrone
        The majestic 12 bot hits my sites with such infrequency that it is never going to be a problem for him.

        Maybe it can tell from other links to your sites that it's already captured, that your site hasn't been updated since the last scan.

        After all SKA is only interested in links not content.

        Which is a shame as he could offer us cheap backups of certain websites (ahem) instead of us all having to download it. About time ISPs had some competition in the internet market.

        I've reserved InternetOnAStick.com for when the tech catches up. They can throttle and block the internet as much as they like once I've my own copy.
        Feist - 1234. One camera, one take, no editing. Superb. How they did it
        Feist - I Feel It All
        Feist - The Bad In Each Other (Later With Jools Holland)



          Originally posted by PAH
          If you're scraping so much every day, how do sites know you're not DoSing them?
          Well, there are over 100 mln active domains out there, just grabbing 10 links each over 24 hours would make up 1 bln.
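
A rough sketch of the arithmetic behind that figure, using only the numbers in the post (100 mln domains, 10 links each, 24 hours). The function name `crawl_load` and the assumption that fetches are spread evenly over the day are illustrative, not a description of how the actual crawler schedules requests.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def crawl_load(domains: int, links_per_domain: int) -> dict:
    """Estimate daily crawl volume, assuming fetches are spread evenly over 24 hours."""
    total_per_day = domains * links_per_domain
    return {
        # Total pages fetched across all domains in one day.
        "total_per_day": total_per_day,
        # Aggregate request rate the crawler as a whole has to sustain.
        "total_per_second": total_per_day / SECONDS_PER_DAY,
        # Gap between two hits on the same domain (what each site owner sees).
        "seconds_between_hits_per_domain": SECONDS_PER_DAY / links_per_domain,
    }


load = crawl_load(domains=100_000_000, links_per_domain=10)
print(load)
```

Under those assumptions the total is large while each individual site sees only one request every couple of hours, which is why the volume by itself would not look like a DoS to any single site.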



            Originally posted by sasguru
            So I still don't understand - does ATW have a credible business model or not?
            It depends if you ask in General or not.
            Originally posted by MaryPoppins
            I'd still not breastfeed a nazi
            Originally posted by vetran
            Urine is quite nourishing



              Originally posted by AtW
              Well, there are over 100 mln active domains out there, just grabbing 10 links each over 24 hours would make up 1 bln.
              You have a system that is high on updates and low on reads. What makes Twitter work is that they actually get people to read their data.



                Originally posted by minestrone
                You have a system that is high on updates and low on reads. What makes Twitter work is that they actually get people to read their data.
                Twitter gets a lot of reads, which is easy to handle because their database is very small: 140 GB max over a 1 week period (assuming all tweets use up the full length, which they don't). Their job is more trivial because they only show recent tweets and rank by recency. In situations when they have 1 mln tweets they just take the most recent ones, which means that in their full text implementation they can pretty much drop inverted index data once it exceeds 1000 tweets. This makes it very easy.
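
To illustrate the point about dropping inverted index data past 1000 tweets, here is a minimal sketch of a recency-capped inverted index. It is not Twitter's implementation: the class name `RecentIndex`, the whitespace tokenizer, and the assumption that tweets arrive in chronological order are all made up for the example. Only the cap of 1000 comes from the post.

```python
from collections import defaultdict, deque
from typing import List

MAX_POSTINGS = 1000  # cap per term, taken from the figure in the post


class RecentIndex:
    """Inverted index that keeps only the newest tweet ids for each term."""

    def __init__(self, max_postings: int = MAX_POSTINGS) -> None:
        self.max_postings = max_postings
        # A deque with maxlen silently discards the oldest entry when full.
        self.postings = defaultdict(lambda: deque(maxlen=max_postings))

    def add(self, tweet_id: int, text: str) -> None:
        # Assumes tweets are added oldest first, so appending keeps recency order.
        for term in set(text.lower().split()):
            self.postings[term].append(tweet_id)

    def search(self, term: str, limit: int = 10) -> List[int]:
        # .get() avoids creating an empty posting list for unknown terms.
        ids = self.postings.get(term.lower())
        if not ids:
            return []
        # Newest first, since results are ranked purely by recency.
        return list(reversed(ids))[:limit]


# Usage with a tiny cap so the truncation is visible.
idx = RecentIndex(max_postings=3)
for i in range(1, 6):
    idx.add(i, "hello world")
print(idx.search("hello"))
```

Because ranking is by recency alone, older postings can never make it into a result page once the cap is exceeded, so discarding them loses nothing. A link index that has to rank across the full history gets no such shortcut.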



                  Originally posted by AtW
                  Twitter gets a lot of reads, which is easy to handle because their database is very small: 140 GB max over a 1 week period (assuming all tweets use up the full length, which they don't). Their job is more trivial because they only show recent tweets and rank by recency. In situations when they have 1 mln tweets they just take the most recent ones, which means that in their full text implementation they can pretty much drop inverted index data once it exceeds 1000 tweets. This makes it very easy.
                  You seem to think running websites is purely down to DB datasize.



                    Originally posted by AtW
                    Well, there are over 100 mln active domains out there, just grabbing 10 links each over 24 hours would make up 1 bln.
                    That sounds a bit daunting having to keep on top of that lot.

                    Can you build in some Quality of Content checks then throw out a list of websites worth visiting?

                    100 million domains yet I only ever visit about 10 of them on a daily basis.

                    Seems there's a lot of crap out there not worth bothering with. How to find the diamonds in the rough? Google searching is useless, someone needs to invent a better way.



                      Originally posted by minestrone
                      You seem to think running websites is purely down to DB datasize.
                      Front end stuff is very easy to run in parallel very cheaply; it's the large scale DB that is a problem for companies like Twitter, Facebook, Google et al.

                      The real-time nature of Twitter certainly made it harder to implement than the usual batch processing. However, the inherent advantages of small text size and a write once, read many times approach make their problem fairly trivial to solve.

                      It's all really a matter of perspective: when you spend your own £50k on stuff like this you have to be smart, but when you want to raise hundreds of millions, making the problem easily solvable will backfire.
