Installation of Salesforce Libraries

In [ ]:
!pip install simple_salesforce
!pip install salesforce_bulk

Importing Commom Libraries

In [ ]:
import json
import datetime as dt
import pandas as pd
import numpy as np
import time as t 

Importing SalesForce Libraries

In [ ]:
from simple_salesforce import Salesforce,SalesforceLogin,SFType
from salesforce_bulk import SalesforceBulk
from salesforce_bulk.util import IteratorBytesIO
from urllib.parse import urlparse
from salesforce_bulk import CsvDictsAdapter
In [1]:
!pip show simple_salesforce
Name: simple-salesforce
Version: 1.11.1
Summary: A basic Salesforce.com REST API client.
Home-page: https://github.com/simple-salesforce/simple-salesforce
Author: Nick Catalano
Author-email: nickcatal@gmail.com
License: Apache 2.0
Location: c:\users\dell\anaconda3\lib\site-packages
Requires: requests, authlib
Required-by: salesforce-bulk
In [5]:
!pip show salesforce_bulk
Name: salesforce-bulk
Version: 2.2.0
Summary: Python interface to the Salesforce.com Bulk API.
Home-page: https://github.com/heroku/salesforce-bulk
Author: Scott Persinger
Author-email: scottp@heroku.com
License: MIT
Location: c:\users\dell\anaconda3\lib\site-packages
Requires: six, unicodecsv, requests, simple-salesforce
Required-by: 

Importing SalesForce Credentials

In [ ]:
import Salesforce_one
import Salesforce_two

  • Enter username, password and security_token (you are able to generate in using the salesforce org)
  • Path to generate salesforce token: My Settings->Reset My Security Token-> Click on Reset Security Token-> Token will come to your registered Email Id
    • Domain has two option:
    • 1) Login : When you are using developer Org
    • 2) Test : When you are using sandbox Org

In [ ]:
username = Salesforce_one.username
password = Salesforce_one.password
security_token = Salesforce_one.security_token
domain = 'login'

Starting a session in the Salesforce Org

In [ ]:
session_id, sf_instance = SalesforceLogin(username=username, password=password, security_token=security_token,domain=domain)
sf = Salesforce(instance=sf_instance,session_id=session_id)
print(username,sf)

Metadata of the Particular object and save the result in the excel or csv file

In [ ]:
md= sf.Object_name
md_metadata = At.describe()
df_md_metadata = pd.DataFrame(At_metadata.get('fields'))
df_md_metadata.to_csv('md_metadata.csv',index=False)

These 3 are the most important column that are been used in the Salesforce Org

In [ ]:
df_md_metadata[['label','name','type']]

These are the columns specified for the particular Object that are to be extracted from the salesforce Org

In [8]:
columns = """Id,ParentId,ActivityId,CreatedById,CreatedDate,LastModifiedDate,LastModifiedById,SystemModstamp,TextBody,HtmlBody,Headers,Subject,FromName,FromAddress,ValidatedFromAddress,ToAddress,CcAddress,BccAddress,Incoming,HasAttachment,Status,MessageDate,IsDeleted,ReplyToEmailMessageId,IsExternallyVisible,MessageIdentifier,ThreadIdentifier,IsClientManaged"""
columns
Out[8]:
'Id,ParentId,ActivityId,CreatedById,CreatedDate,LastModifiedDate,LastModifiedById,SystemModstamp,TextBody,HtmlBody,Headers,Subject,FromName,FromAddress,ValidatedFromAddress,ToAddress,CcAddress,BccAddress,Incoming,HasAttachment,Status,MessageDate,IsDeleted,ReplyToEmailMessageId,IsExternallyVisible,MessageIdentifier,ThreadIdentifier,IsClientManaged'
In [9]:
querySQQL = "select {} from Object_name".format(columns)
print(querySQQL)
select Id,ParentId,ActivityId,CreatedById,CreatedDate,LastModifiedDate,LastModifiedById,SystemModstamp,TextBody,HtmlBody,Headers,Subject,FromName,FromAddress,ValidatedFromAddress,ToAddress,CcAddress,BccAddress,Incoming,HasAttachment,Status,MessageDate,IsDeleted,ReplyToEmailMessageId,IsExternallyVisible,MessageIdentifier,ThreadIdentifier,IsClientManaged from Emailmessage

Extract the data in one time and only 200 rows are available in one go.

In [ ]:
response = sf.query(querySQQL)
lstRecords = response.get('records')
nextRecordsUrl = response.get('nextRecordsUrl')
print(response.get('totalSize'))
#This will print the total Size of the record that are present in the particular object

Using While Loop To extract all the rows that are present in the Object

In [11]:
while not response.get('done'):
    response = sf.query_more(nextRecordsUrl, identifier_is_url=True)
    lstRecords.extend(response.get('records'))
    nextRecordsUrl = response.get('nextRecordsUrl')

Converting the series of records in the python dataframe to manupilate and transform the data.

In [12]:
final_insert = pd.DataFrame(lstRecords).drop('attributes',axis=1)
final_insert.shape
Out[12]:
(763771, 28)

Check number of unique id's and shape of the dataframe are equal and that will confirm that there are not duplicate rows in the dataframe

In [ ]:
final_insert.Id.nunique()

Creating the new DataFrame and Copying all the columns to the new dataframe

Questions comes why we are doing that

  1. Change the datatype of the column
  2. Manuplate or Tranform the data
  3. Take the data from one salesforce org to another org but the data is same but has different column name
  4. It will be easily undertantable that what columns are we supposed to transfer to another org
In [182]:
df_em = pd.DataFrame()
In [183]:
df_em['Old_SF_ID__c'] = final_insert['Id']
df_em["BccAddress"]=final_insert["BccAddress"]
df_em["CcAddress"]=final_insert["CcAddress"]
df_em["ValidatedFromAddress__c"]=final_insert["ValidatedFromAddress"]
df_em["FromAddress"]=final_insert["FromAddress"]
df_em["FromName"]=final_insert["FromName"]
df_em["Headers"]=final_insert["Headers"]
df_em["HtmlBody"]=final_insert["HtmlBody"]
df_em["IsExternallyVisible"]=final_insert["IsExternallyVisible"]
df_em["Incoming"]=final_insert["Incoming"]
df_em["MessageDate"]=final_insert["MessageDate"]
df_em["Status"]=final_insert["Status"]
df_em["Subject"]=final_insert["Subject"]
df_em["TextBody"]=final_insert["TextBody"]
df_em["ToAddress"]=final_insert["ToAddress"]

This is the step to tranform or manupilate salesfore data by using for loop or if condition

Note:

In Salesforce Org we can not insert the null values. So we have to deal with the null values by filling with blank values.

In [184]:
df_em.fillna('',inplace = True)
In [185]:
df_em.shape
Out[185]:
(472, 15)

Checking the null values

In [186]:
df_em.isnull().sum()
Out[186]:
Old_MM_SF_ID__c            0
BccAddress                 0
CcAddress                  0
ValidatedFromAddress__c    0
FromAddress                0
FromName                   0
Headers                    0
HtmlBody                   0
IsExternallyVisible        0
Incoming                   0
MessageDate                0
Status                     0
Subject                    0
TextBody                   0
ToAddress                  0
dtype: int64

Login to another Salesforce Org where to transfer the data from Salesforce Org

In [ ]:
username = Salesforce_two.username
password = Salesforce_two.password
security_token = Salesforce_two.security_token
domain = 'login'
In [ ]:
session_id, sf_instance = SalesforceLogin(username=username, password=password, security_token=security_token,domain=domain)
sf = Salesforce(instance=sf_instance,session_id=session_id)
print(sf)

In this step we have to mention the object name that on which we have to performing the operation(insert,delete or update)

In [ ]:
project = SFType('Object_name',session_id,sf_instance)

Using SalesforceBulk api to perform insert, delete or update operation in salesforce Org and it has the operation of using Parallel computation also.

In [ ]:
#using Bulk Api to insert the data
bulk = SalesforceBulk(sessionId=session_id, host=urlparse('http://'+sf_instance).hostname)
job = bulk.create_insert_job("Object_name", contentType='CSV',concurrency='Parallel')
# this is to specify no of records u want to enter in each iteration
# which is totally depends on the size of the data.
for i in range(0,len(df_em),10000):  
    df=[]                        
    df = df_em[i:i+10000]
    res1=df.to_dict(orient='records')
    csv_iter = CsvDictsAdapter(iter(res1))
    batch = bulk.post_batch(job, csv_iter)
    result=bulk.wait_for_batch(job, batch)
bulk.close_job(job) 
#Most important step to close the job