How to find a particular column in a database which is having n Number of tables.

Finding a particular column from a database which is having more number of tables.


Let's suppose we have sales data which contains some n number of tables and one need to find any particular column in database .

Here, taking example of sales database which has some 13 number of columns and in some of the tables there is some common column like "order_id", "customer_id".

So, manually it will be hectic or time consuming to find in each table.

Here is the python script to find the particular desired columns from all tables of the database.

Database_name = "sales"

Tables :  

  • customers table: contains customer information such as name, address, and contact details.
  • orders table: contains information about customer orders, including order date, product details, and total cost.
  • employees table: contains employee information such as name, address, and contact details.
  • products table: contains information about the products that are available for purchase, including product name, description, and price.
  • categories table: contains information about the different product categories, including category name and description.
  • suppliers table: contains information about the suppliers of the products, including supplier name, address, and contact details.
  • shippers table: contains information about the shipping companies used to deliver the products, including shipper name and contact details.
  • inventory table: contains information about the current stock levels of the products, including product name, quantity, and location.
  • sales table: contains information about the sales made, including sale date, product details, and total revenue.
  • payments table: contains information about the payments received from customers, including payment date, amount, and payment method.
  • expenses table: contains information about the expenses incurred by the business, including expense date, amount, and category.
  • returns table: contains information about the products that have been returned by customers, including return date, product details, and reason for return.
  • discounts table: contains information about the discounts that are available, including discount

Let's suppose desired column is order_id.

Code in Python :



OR

Genaralised Python script : 


OR

First printing table names which is present inside sales database, 

'fouund' --> order_id found in particular table

'Not found' --> order_id not found in particular table

  • spark.sql command used to convert any sql query to PySpark query.

Hyper link for Functions used in the above code : 



                    Thanks
Raman Gupta & Sandesh Deshmukh







Comments

  1. Well Explained..KUDOS to your efforts !!!
    Just display lists for found and not found tables rather than this sequences

    ReplyDelete

Post a Comment

Popular posts from this blog

Calculate Time taken by code snippets or a notebook to run in Databricks