Feb 6, 2019

Import Data on Amazon RDS using Datapump utility

- From on-premises / EC2 to AWS RDS

A common question is: "How do we import data into an Amazon RDS DB instance?"

The answer depends on the amount of data you have and on the number and variety of database objects in your database, and each approach comes with its own challenges.

For example, you can use Oracle SQL Developer to import a simple database of 50 MB or 100 MB. You can use Oracle Data Pump to import complex databases, or databases that are several hundred megabytes or several terabytes in size.

You can also use AWS Database Migration Service (AWS DMS) to import data into an Amazon RDS DB instance. AWS DMS can migrate databases without downtime and, for many database engines, continue ongoing replication until you are ready to switch over to the target database. You can migrate to Oracle from either the same database engine or a different database engine using AWS DMS. If you are migrating from a different database engine, you can use the AWS Schema Conversion Tool to migrate schema objects that are not migrated by AWS DMS. For more information about AWS DMS, see What is AWS Database Migration Service.

The following performance guidelines apply to all RDS data import/export operations:

Load and unload data in parallel using compression and multiple threads.
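
As a quick sketch of this guideline (the PARALLEL value below is illustrative, and COMPRESSION=ALL requires the Advanced Compression option, so METADATA_ONLY is shown as a safe default), an export can use multiple Data Pump workers and compression:

expdp directory=export dumpfile=test1_%U.dmp logfile=test1_expdp.log schemas=TEST_SCHEMA parallel=4 compression=metadata_only exclude=statistics

The %U substitution variable lets Data Pump create one dump file per worker process.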

For large data loads, consider disabling automated backups by setting the backup retention for the RDS DB instance to zero; a restart of the RDS DB instance is necessary to enact this change. Disabling backups is not recommended for production instances. Disabling backups prevents point-in-time recovery functionality, deletes all previous automated snapshots for the DB instance, and prevents recovery of deleted snapshots. If the DB instance is not yet in production, disabling backups can improve load performance. This change should be immediately rolled back when the data load is complete. Also consider generating recovery points by creating manual snapshots at critical stages of the data load process.
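
As a sketch with the AWS CLI (the DB instance identifier is a placeholder), the backup retention period can be set to zero before the load and restored afterwards:

# disable automated backups before the load (this change causes the restart noted above)
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --backup-retention-period 0 --apply-immediately

# re-enable backups (e.g., 7 days) once the load is complete
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --backup-retention-period 7 --apply-immediately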

Consider disabling Multi-AZ during the data load operation to reduce the overhead caused by Multi-AZ's synchronous writes. Multi-AZ is a recommended best practice for all production RDS DB instances and should be re-enabled as soon as the data load is complete.
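
A similar AWS CLI sketch for toggling Multi-AZ (again, the instance identifier is a placeholder):

# convert to Single-AZ for the duration of the load
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --no-multi-az --apply-immediately

# convert back to Multi-AZ after the load
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --multi-az --apply-immediately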

If you are loading a large amount of data in parallel, ensure that the client machine has sufficient resources to avoid becoming a bottleneck during the data load process.
The Data Pump utilities expdp and impdp cannot be run directly on the RDS host, because SSH access to the host where the RDS database runs is not allowed. They must be run from a client machine, and dump files must be moved into the RDS DATA_PUMP_DIR directory.


-- Scenario

Source : On Premise

hostname : example01
database : testdb


Target : AWS RDS
database : testorcl


Step-1: Take a Data Pump export from the on-premises / EC2 instance

a) Create the export directory on the source database

SQL> create or replace directory export as '/u03/backup/export';
Directory created.

SQL> select name from v$database;

NAME
---------
TESTDB

b) Export the dump file from the source database (skip this step if you already have an export dump file)

expdp directory=export dumpfile=test1.dmp logfile=test1_expdp.log schemas=TEST_SCHEMA exclude=statistics

$ pwd

/u03/backup/export
$ ls
 test1.dmp  test1_expdp.log


Step-2: Create a database link to the target DB instance:

Create a database link between your source instance and your target DB instance. Note that your local Oracle instance must have network connectivity to the DB instance in order to create a database link and to transfer your export dump file.

Perform this step connected with the same user account as the previous step.

The following command creates a database link named to_rds that connects to the Amazon RDS master user at the target DB instance:

e.g.,
SQL> create database link to_rds connect to <master_user_account> identified by <password>
     using '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<dns or ip address of remote db>)(PORT=<listener port>))(CONNECT_DATA=(SID=<remote SID>)))';

Note: Use the connection details provided by AWS (endpoint, port, and service name or SID) to create your database link. In this example, assume a link named "RDS_DBLINK_EXAMPLE" has been created.
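
A quick way to confirm the link works before transferring anything is to query the remote instance through it, for example:

SQL> select sysdate from dual@RDS_DBLINK_EXAMPLE;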


Step-3: Create a privileged user to perform the import

Grant the following privileges to the user/schema that you will use for the import.

grant read, write on directory DATA_PUMP_DIR to TEST_SCHEMA_RDS;
grant execute on dbms_datapump to TEST_SCHEMA_RDS;
grant select_catalog_role to TEST_SCHEMA_RDS;
grant create session,create table to TEST_SCHEMA_RDS;
grant execute on dbms_file_transfer to TEST_SCHEMA_RDS;
GRANT UNLIMITED TABLESPACE TO TEST_SCHEMA_RDS;

Note: This assumes the target user already exists. Run the grants above as the admin (master) user provided by AWS.
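
If the target user does not exist yet, a minimal sketch of creating it as the AWS admin user follows. The password is a placeholder, and the default tablespace name simply matches the REMAP_TABLESPACE target used later in Step-6, assuming that tablespace already exists on the RDS instance:

SQL> create user TEST_SCHEMA_RDS identified by "<password>" default tablespace TEST_SCHEMA_RDS;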


Step-4: Transfer the dump file from the on-premises instance to the AWS RDS instance

Use DBMS_FILE_TRANSFER to copy the exported dump file from the source DB instance to the target DB instance.

Log in to the EC2 or on-premises Oracle instance and copy the dump file from the local EXPORT directory to the RDS DATA_PUMP_DIR.

The following PL/SQL block copies the dump file test1.dmp from the source EXPORT directory to DATA_PUMP_DIR on the target DB instance, using the database link RDS_DBLINK_EXAMPLE created in the previous step:


Connect to the source database on your local host / EC2 instance and run:


BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object      => 'EXPORT',
    source_file_name             => 'test1.dmp',
    destination_directory_object => 'DATA_PUMP_DIR',
    destination_file_name        => 'test1.dmp',
    destination_database         => 'RDS_DBLINK_EXAMPLE');
END;
/
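
To confirm that the copy succeeded, connect to the RDS instance and list DATA_PUMP_DIR (the same listing is used again in Step-7):

SQL> select filename, type, filesize, mtime from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR')) order by mtime;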


Step-5: Check the Data Pump directory name on the AWS RDS instance.


SQL> select owner,directory_name,directory_path from dba_directories;

OWNER       |DIRECTORY_NAME      |DIRECTORY_PATH

------------|--------------------|----------------------------------------
SYS         |DATA_PUMP_DIR       |/rdsoradata/datapump



Step-6: Use DBMS_DATAPUMP or impdp to import the dump file into the target DB instance

Many of the examples posted online do not work properly. The PL/SQL block below is adapted from the Oracle documentation and works correctly.

Here, change the dump file name, the schema names, and the REMAP_TABLESPACE options to match your environment.

set serveroutput on;
set timing on;
DECLARE
  ind NUMBER;              -- Loop index
  h1 NUMBER;               -- Data Pump job handler
  percent_done NUMBER;     -- Percentage of job complete
  job_state VARCHAR2(30);  -- To keep track of job state
  le ku$_LogEntry;         -- For WIP and error messages
  js ku$_JobStatus;        -- The job status from get_status
  jd ku$_JobDesc;          -- The job description from get_status
  sts ku$_Status;          -- The status object returned by get_status
BEGIN
  h1 := DBMS_DATAPUMP.OPEN('IMPORT','FULL',NULL,'EXAMPLE1');
  DBMS_DATAPUMP.ADD_FILE(h1,'test1.dmp','DATA_PUMP_DIR');
  DBMS_DATAPUMP.METADATA_REMAP(h1,'REMAP_SCHEMA','TEST_SCHEMA','TEST_SCHEMA_RDS');
  DBMS_DATAPUMP.METADATA_REMAP(h1,'REMAP_TABLESPACE','TEST_SCHEMA','TEST_SCHEMA_RDS');
  DBMS_DATAPUMP.SET_PARAMETER(h1,'TABLE_EXISTS_ACTION','REPLACE');
  DBMS_DATAPUMP.START_JOB(h1);

 percent_done := 0;
  job_state := 'UNDEFINED';
  while (job_state != 'COMPLETED') and (job_state != 'STOPPED') loop
    dbms_datapump.get_status(h1,
           dbms_datapump.ku$_status_job_error +
           dbms_datapump.ku$_status_job_status +
           dbms_datapump.ku$_status_wip,-1,job_state,sts);
    js := sts.job_status;

    -- If the percentage done changed, display the new value.
    if js.percent_done != percent_done
    then
      dbms_output.put_line('*** Job percent done = ' ||
                           to_char(js.percent_done));
      percent_done := js.percent_done;
    end if;

    -- If any work-in-progress (WIP) or error messages were received
    -- for the job, display them.
    if (bitand(sts.mask,dbms_datapump.ku$_status_wip) != 0)
    then
      le := sts.wip;
    else
      if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0)
      then
        le := sts.error;
      else
        le := null;
      end if;
    end if;
    if le is not null
    then
      ind := le.FIRST;
      while ind is not null loop
        dbms_output.put_line(le(ind).LogText);
        ind := le.NEXT(ind);
      end loop;
    end if;
  end loop;

  -- Indicate that the job finished and gracefully detach from it.
  dbms_output.put_line('Job has completed');
  dbms_output.put_line('Final job state = ' || job_state);
  dbms_datapump.detach(h1);
END;
/
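
Alternatively, as Step-6 mentions, the same import can be run with the impdp client from the EC2 / on-premises host, connecting over SQL*Net to the RDS endpoint. This is only a sketch; the endpoint and port are placeholders from your RDS connection details, and the service name is assumed to match the RDS database name testorcl:

impdp TEST_SCHEMA_RDS/<password>@//<rds-endpoint>:1521/testorcl directory=DATA_PUMP_DIR dumpfile=test1.dmp logfile=test1_impdp.log remap_schema=TEST_SCHEMA:TEST_SCHEMA_RDS remap_tablespace=TEST_SCHEMA:TEST_SCHEMA_RDS table_exists_action=replace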


Step-7: Clean up

Connect as the admin (master) user of the AWS RDS instance and run the queries below:

SQL> select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR'))
     order by mtime;

FILENAME                    TYPE          FILESIZE MTIME
--------------------------- ---------- ----------- ---------
test1.dmp                   file            290816 31-JAN-19
example.dmp                 file        7470804992 31-JAN-19

SQL>


For example, remove the dump files once the import has completed and been verified:

SQL> exec utl_file.fremove('DATA_PUMP_DIR','test1.dmp');
SQL> exec utl_file.fremove('DATA_PUMP_DIR','example.dmp');


SQL> select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR'))
     order by mtime;

FILENAME                      TYPE         FILESIZE MTIME
----------------------------- ---------- ---------- ---------
datapump/                     directory        4096 01-FEB-19


I hope this helped you.

Comments:

  1. Hi,

    Thank You so much for this article.

    I want to run IMPDP using a network link and import 5 schemas.

    I have given the following privileges to the Source user and Target user :


    grant create session, create table, create database link to MYUSER;
    alter user MYUSER quota unlimited on MYUSER_TS;
    grant read, write on directory data_pump_dir to MYUSER;
    grant select_catalog_role to MYUSER;
    grant execute on dbms_datapump to MYUSER;
    grant execute on dbms_file_transfer to MYUSER;


    In Schemas line, I have

    SCHEMAS=MySchema1,MySchema2,MySchema3,MySchema4,MySchema5

    ReMap_Schema= (I have remapped accordingly)


    When I execute IMPDP, I get the following Error :

    ORA-31631: privileges are required
    ORA-39109: Unprivileged users may not operate upon other users' schemas

    In Normal Oracle DB, we achieve this by giving the "DATAPUMP_IMP_FULL_DATABASE" Privilege. Can you please let us know how do we achieve this in RDS and Import other Schemas as well?

    Thanks,
    Prasanna K

  2. Hi,

    I resolved it. Found a work-around.

    Gave DBA Privilege to MYUSER and revoked it immediately after completing IMPDP.

    Grant DBA to MYUSER;

    Thanks,
    Prasanna K
