Posts

Showing posts from January, 2023

How to find a particular column in a database that has n tables

Finding a particular column in a database that has a large number of tables. Suppose we have sales data spread across n tables and need to find a particular column somewhere in the database. The example here is a sales database with some 13 columns, where several tables share common columns such as "order_id" and "customer_id". Checking each table manually is tedious and time-consuming, so here is a Python script that finds the desired columns across all tables of the database.

Database_name = "sales"

Tables:
customers table: contains customer information such as name, address, and contact details.
orders table: contains information about customer orders, including order date, product details, and total cost.
employees table: contains employee information such as name, address, and contact details.
products table: contains information about the products that are available for purcha...
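The excerpt above is cut off before the script itself, so the following is only a minimal sketch of the idea, not the post's actual code. It assumes a SQLite database (an in-memory one, so the example is self-contained), and the helper name `find_column` and the sample table definitions are hypothetical. Other database engines expose column metadata differently, so the catalog queries would need to change.

```python
import sqlite3


def find_column(conn, column_name):
    """Return the names of all tables that contain column_name (SQLite only)."""
    # sqlite_master lists every table defined in the database.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    matches = []
    for table in tables:
        # PRAGMA table_info yields one row per column; index 1 is the column name.
        columns = [row[1] for row in conn.execute(f'PRAGMA table_info("{table}")')]
        if column_name in columns:
            matches.append(table)
    return matches


# Hypothetical sample schema standing in for the "sales" database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER, name TEXT)")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER, customer_id INTEGER, order_date TEXT)"
)
conn.execute("CREATE TABLE products (product_id INTEGER, name TEXT)")

customer_tables = find_column(conn, "customer_id")
order_tables = find_column(conn, "order_id")
print(customer_tables)
print(order_tables)
```

Looping over the catalog once per lookup keeps the sketch short; for a database with many tables it may be worth building a column-to-tables mapping once and reusing it.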

Calculate Time taken by code snippets or a notebook to run in Databricks

How to calculate the time taken by a code snippet or a notebook to run in Databricks. The same approach gives the elapsed time of a particular piece of code or a group of statements.

Code in Scala: to check how much time a SQL query takes to run in Databricks.

    val startTime = System.nanoTime()
    val result = spark.sql("SELECT * FROM my_table")
    result.show() // remove this line if you don't want to display the table result
    val endTime = System.nanoTime()
    val duration = (endTime - startTime) / 1e9
    println(s"Query took $duration seconds")

In the output of the above code the start time is in timestamp format; to convert it into a date, see the link given below: link to convert timestamp into date

Code in Python/PySpark: to check how much time a code snippet takes to run. It can also be applied to a whole Databricks notebook.

    import time
    start_time = time.time()
    <put the code here>
    end_time = time.time()
    elapsed_time = end_time - start_time
    print(f'Elapsed time: {elapsed_time}')

li...
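The start/stop pattern above can also be wrapped in a small reusable helper. The following is a sketch in plain Python, not part of the original post: the context manager `timed` and the `timings` dictionary are hypothetical names, and it uses `time.perf_counter()` instead of `time.time()` because that clock is intended for measuring short intervals. Nothing in it is specific to Databricks, so it should behave the same way inside a notebook cell.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label, results):
    """Store the elapsed seconds of the enclosed block in results[label]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        # Recorded even if the timed block raises an exception.
        results[label] = time.perf_counter() - start


timings = {}
with timed("sum", timings):
    # Placeholder workload; put the code to be measured here.
    total = sum(range(1_000_000))

print(f"Elapsed time: {timings['sum']:.4f} seconds")
```

Collecting results in a dictionary makes it easy to time several snippets in one notebook and compare them at the end.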