      How to Read PDF File in Python

      • Posted by Python Point Team
      • Categories Python
      • Date November 25, 2020

      PDF (Portable Document Format) is one of the most widely used formats for digital documents. It presents and exchanges documents reliably, independent of software, hardware, or operating system.

      In this article, we will see how to read a PDF file in Python using the third-party module PyPDF2. The module can extract document information, split documents page by page, merge documents, crop pages, combine multiple pages into a single page, and encrypt and decrypt PDF files.

      To install PyPDF2, run:

      pip install PyPDF2

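      import PyPDF2

      # Open the PDF in binary mode ('rb'); a PDF is not a plain-text file.
      pdf_file = open('test.pdf', 'rb')

      # Create a reader object for the opened file.
      pdf_reader = PyPDF2.PdfReader(pdf_file)

      # pdf_reader.pages is a sequence of page objects, one per page.
      print("The number of pages:", len(pdf_reader.pages))

      # Pages are indexed from 0, so this is the first page.
      page = pdf_reader.pages[0]

      # Extract the text of that page as a string.
      print(page.extract_text())

      # Close the file once all operations on the PDF are done.
      pdf_file.close()

      Output (exact line breaks may vary with the PyPDF2 version):

      The number of pages: 1

      Take
      Risks In Your Life

      If
      You Win, You Can Lead !

      -
      Swami Vivekananda

      Now let's see what this code means.

      The first step is to import the PyPDF2 module. Next, we open the PDF file with the built-in open() function in binary mode ('rb'). We then pass the file object to the PdfReader class of PyPDF2, which gives us a PDF reader object. Its pages attribute is a sequence of page objects, so len(pdf_reader.pages) gives the number of pages in the file, and pdf_reader.pages[0] returns the first page (page numbers start at 0). The extract_text() method extracts the text from the selected page. Finally, after all operations on the PDF file are done, we close the file object with close().

      Releases before PyPDF2 3.0 also used the names PdfFileReader, numPages, getPage() and extractText(); PyPDF2 3.0 removed them, so code written with those names fails on a current installation.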
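
The steps above can be extended to every page of a document. The following is a minimal sketch rather than a definitive implementation, assuming PyPDF2 3.x naming (PdfReader, pages, extract_text()). The helper names join_page_text and read_pdf_text are hypothetical, and a with block takes the place of the explicit close() call.

```python
def join_page_text(pages):
    """Join the text of each page, labelling pages from 1.

    `pages` is any iterable of objects that have an extract_text() method,
    such as the `pages` attribute of a PyPDF2 PdfReader.
    """
    parts = []
    for number, page in enumerate(pages, start=1):
        # Treat a missing result as empty text (defensive assumption).
        text = page.extract_text() or ""
        parts.append(f"--- page {number} ---\n{text.strip()}")
    return "\n".join(parts)


def read_pdf_text(path):
    """Return the text of every page of the PDF at `path` (hypothetical helper)."""
    # Imported here so join_page_text can be used without PyPDF2 installed.
    from PyPDF2 import PdfReader

    # The with block closes the file even if reading fails part-way.
    with open(path, "rb") as pdf_file:
        reader = PdfReader(pdf_file)
        # Pages are read from the open file, so extract before leaving the block.
        return join_page_text(reader.pages)
```

Assuming a file named test.pdf exists in the working directory, print(read_pdf_text("test.pdf")) prints the text of each page under its own header line.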

      You may notice similarities between PyPDF2 operations and Python's built-in file operations. Keep in mind that the module is not perfect: it may fail to read some PDF files or extract their text incorrectly, and scanned PDFs that store pages only as images contain no text for it to extract.
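
One way to cope with such files is to check the result of the extraction instead of assuming it worked. The sketch below is illustrative only and assumes PyPDF2 3.x (PdfReader, is_encrypted, pages, extract_text(), and PdfReadError from PyPDF2.errors). The function names describe_extraction and check_pdf, and the wording of the messages, are hypothetical.

```python
def describe_extraction(page_texts):
    """Summarise which pages, numbered from 1, produced no text."""
    if not page_texts:
        return "the document has no pages"
    empty = [
        number
        for number, text in enumerate(page_texts, start=1)
        if not (text or "").strip()
    ]
    if len(empty) == len(page_texts):
        return "no text found; the PDF may consist of scanned images"
    if empty:
        return f"no text on page(s): {empty}"
    return "text found on every page"


def check_pdf(path):
    """Report whether text can be extracted from the PDF at `path`."""
    # Imported here so describe_extraction works without PyPDF2 installed.
    from PyPDF2 import PdfReader
    from PyPDF2.errors import PdfReadError

    try:
        with open(path, "rb") as pdf_file:
            reader = PdfReader(pdf_file)
            if reader.is_encrypted:
                # Decryption is left out of this sketch.
                return "the PDF is encrypted; decrypt it before reading"
            texts = [page.extract_text() for page in reader.pages]
    except PdfReadError as exc:
        return f"PyPDF2 could not parse the file: {exc}"
    return describe_extraction(texts)
```

Calling check_pdf("test.pdf") on an existing file returns one of the short status messages, which is usually enough to decide whether another tool, such as OCR software for scanned documents, is needed.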
