
PDF data


    Does anybody know of any software (free!!!!) which would allow me to get to the data held in a PDF file?
    Rule Number 1 - Assuming that you have a valid contract in place always try to get your poo onto your timesheet, provided that the timesheet is valid for your current contract and covers the period of time that you are billing for.

    I preferred version 1!

    #2
    How do you mean? The data in a table in a PDF? Can you not just highlight, copy and paste into something like Excel? Am I missing something?
    It's about time I changed this sig...



      #3
      The report I might be getting will be really big and not really an option for cutting and pasting - I want something which will take a PDF and pump the data to a text file so that I can pull it into a database. Excel can't do more than 65000 rows and this will be a lot bigger.
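
Once the report is in a delimited text file, pulling it into a database could look roughly like the sketch below. It assumes a tab-delimited file with a header row and uses SQLite purely for illustration; the table name `report_rows` and the function name are made up, and every column is stored as text.

```python
import csv
import sqlite3


def load_delimited_text(conn, lines, table="report_rows", delimiter="\t"):
    """Load delimited text (first line = header) into a table; return the row count.

    `lines` can be an open file or any iterable of strings.
    `report_rows` is a hypothetical table name chosen for this sketch.
    """
    reader = csv.reader(lines, delimiter=delimiter)
    header = next(reader)
    # Every column is created as TEXT; type conversion is left for later.
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn.execute(f'CREATE TABLE IF NOT EXISTS "{table}" ({columns})')
    # executemany streams the remaining rows, so the whole file is never
    # held in memory at once.
    conn.executemany(f'INSERT INTO "{table}" VALUES ({placeholders})', reader)
    conn.commit()
    return conn.execute(f'SELECT COUNT(*) FROM "{table}"').fetchone()[0]
```

With a real file this would be called as `load_delimited_text(sqlite3.connect("report.db"), open("report.txt", newline=""))`, and the spreadsheet row limit never comes into it.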


        #4
        http://www.google.co.uk/search?hl=en...e+Search&meta=
        This looks quite good and not a bad price:
        http://www.docsmartz.net/

        Once it's in a text format you might be able to use Microsoft's Log Parser:
        http://www.microsoft.com/downloads/d...displaylang=en

        and a combination of other custom scripts to strip out what you actually want.
        Coffee's for closers



          #5
          Thanks folks. I've found one that will do it.


            #6
            Which one was that?


              #7
              I'll let you know when I've tested it - it might not work yet!!


                #8
                pdftotext is distributed with most *nixes, and you can get it on Windows as part of Cygwin, too.

                http://en.wikipedia.org/wiki/Pdftotext

                Edit - if you follow the link in the Wikipedia article, there appears to be a download compiled with MSVC which does not require Cygwin.
                Last edited by bored; 18 September 2007, 19:48.
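
For a big report you would probably want to script the conversion. A minimal sketch, assuming `pdftotext` is installed and on the PATH: the two helper functions are made up for illustration, while `-layout` is a standard pdftotext option that tries to keep the physical page layout, which may help tables stay in columns.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftotext_command(pdf_path, txt_path, keep_layout=True):
    """Build the argument list for one pdftotext call."""
    command = ["pdftotext"]
    if keep_layout:
        # -layout asks pdftotext to maintain the original physical layout,
        # so table columns tend to stay lined up in the output.
        command.append("-layout")
    command += [str(pdf_path), str(txt_path)]
    return command


def pdf_to_text(pdf_path, txt_path=None, keep_layout=True):
    """Convert one PDF to a text file and return the text file's path."""
    pdf_path = Path(pdf_path)
    txt_path = Path(txt_path) if txt_path else pdf_path.with_suffix(".txt")
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext was not found on the PATH")
    # check=True raises CalledProcessError if pdftotext exits with an error.
    subprocess.run(build_pdftotext_command(pdf_path, txt_path, keep_layout), check=True)
    return txt_path
```

Calling `pdf_to_text("report.pdf")` should leave `report.txt` next to the PDF; how usable the tables are will still depend on how the PDF was produced.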



                  #9
                  Originally posted by bored:
                  pdftotext is distributed with most *nixes and you can get it on Windows as part of cygwin, too.

                  http://en.wikipedia.org/wiki/Pdftotext

                  Edit - if you follow the link in the wikipedia article, there appears to be a download compiled with MSVC which does not require cygwin.
                  Just tried that... it works OK(ish). Tables don't seem to come out of it too well, though, and it would be difficult to work with the resulting text file programmatically.


                    #10
                    That is what I have been finding with all the free ones. They also do things like drop the minus symbol, which is a pain when it appears in some (but not all) of the server names we have here. They are also not consistent when handling tables: sometimes they place the data from 3 columns in 3 separate rows, and other times they just join it all together.
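
One way to cope with the inconsistent table output is to parse the text defensively and set aside any line that doesn't have the expected number of columns. The sketch below assumes the converter separates columns with runs of two or more spaces (as layout-preserving output typically does); the function names and sample server names are made up, and it does nothing to recover characters the converter has already dropped.

```python
import csv
import io
import re

# Assumption: columns are separated by at least two whitespace characters.
COLUMN_GAP = re.compile(r"\s{2,}")


def split_columns(line):
    """Split one line of extracted text into fields on wide gaps."""
    return COLUMN_GAP.split(line.strip())


def table_lines_to_rows(lines, expected_columns):
    """Return (rows, rejects): well-formed rows and lines needing a manual look."""
    rows, rejects = [], []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines between table rows
        fields = split_columns(line)
        if len(fields) == expected_columns:
            rows.append(fields)
        else:
            # Columns were joined together or spilled onto separate lines.
            rejects.append(line)
    return rows, rejects


def rows_to_csv(rows):
    """Serialise rows as CSV text, ready for a database import."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()
```

Keeping the rejects in a separate list at least tells you how many rows the converter mangled before anything reaches the database.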