Feb 6, 2019

Import Data on Amazon RDS using Datapump utility

- From on-premises / EC2 to AWS RDS

A common question is: "How do we import data into an Amazon RDS DB instance?"

The answer depends on the amount of data you have and on the number and variety of database objects in your database, and each approach comes with its own challenges.

For example, you can use Oracle SQL Developer to import a simple database of 50 MB or 100 MB. You can use Oracle Data Pump to import complex databases, or databases that are several hundred megabytes or several terabytes in size.

You can also use AWS Database Migration Service (AWS DMS) to import data into an Amazon RDS DB instance. AWS DMS can migrate databases without downtime and, for many database engines, continue ongoing replication until you are ready to switch over to the target database. You can migrate to Oracle from either the same database engine or a different database engine using AWS DMS. If you are migrating from a different database engine, you can use the AWS Schema Conversion Tool to migrate schema objects that are not migrated by AWS DMS. For more information about AWS DMS, see What is AWS Database Migration Service.

The following performance guidelines apply to all RDS data import/export operations:

Load and unload data in parallel using compression and multiple threads.
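
As a quick sketch of this guideline (the PARALLEL value below is illustrative, and COMPRESSION=ALL requires the Advanced Compression option, so METADATA_ONLY is shown as a safe default), an export can use multiple Data Pump workers and compression:

expdp directory=export dumpfile=test1_%U.dmp logfile=test1_expdp.log schemas=TEST_SCHEMA parallel=4 compression=metadata_only exclude=statistics

The %U substitution variable lets Data Pump create one dump file per worker process.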

For large data loads, consider disabling automated backups by setting the backup retention for the RDS DB instance to zero; a restart of the RDS DB instance is necessary to enact this change. Disabling backups is not recommended for production instances. Disabling backups prevents point-in-time recovery functionality, deletes all previous automated snapshots for the DB instance, and prevents recovery of deleted snapshots. If the DB instance is not yet in production, disabling backups can improve load performance. This change should be immediately rolled back when the data load is complete. Also consider generating recovery points by creating manual snapshots at critical stages of the data load process.
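
As a sketch with the AWS CLI (the DB instance identifier is a placeholder), the backup retention period can be set to zero before the load and restored afterwards:

# disable automated backups before the load (this change causes the restart noted above)
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --backup-retention-period 0 --apply-immediately

# re-enable backups (e.g., 7 days) once the load is complete
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --backup-retention-period 7 --apply-immediately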

Consider disabling Multi-AZ during the data load operation to reduce the overhead caused by Multi-AZ's synchronous writes. Multi-AZ is a recommended best practice for all production RDS DB instances and should be re-enabled as soon as the data load is complete.
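
A similar AWS CLI sketch for toggling Multi-AZ (again, the instance identifier is a placeholder):

# convert to Single-AZ for the duration of the load
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --no-multi-az --apply-immediately

# convert back to Multi-AZ after the load
aws rds modify-db-instance --db-instance-identifier <your-db-instance-id> --multi-az --apply-immediately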

If you are loading a large amount of data in parallel, ensure that the client machine has sufficient resources to avoid becoming a bottleneck during the data load process.
The Data Pump utilities expdp and impdp cannot be run directly on the RDS host, because SSH access to the host where the RDS database runs is not allowed. They must be run from a client machine, and dump files must be moved into the RDS DATA_PUMP_DIR directory.


-- Scenario

Source : On Premise

hostname : example01
database : testdb


Target : AWS RDS
database : testorcl


Step-1: Take a Data Pump export from the on-premises / EC2 instance

a) Create the export directory on the source database

SQL> create or replace directory export as '/u03/backup/export';
Directory created.

SQL> select name from v$database;

NAME
---------
TESTDB

b) Export the dump file from the source database (skip this step if you already have an export dump file)

expdp directory=export dumpfile=test1.dmp logfile=test1_expdp.log schemas=TEST_SCHEMA exclude=statistics

$ pwd

/u03/backup/export
$ ls
 test1.dmp  test1_expdp.log


Step-2: Create a database link to the target DB instance:

Create a database link between your source instance and your target DB instance. Note that your local Oracle instance must have network connectivity to the DB instance in order to create a database link and to transfer your export dump file.

Perform this step connected with the same user account as the previous step.

The following command creates a database link named to_rds that connects to the Amazon RDS master user at the target DB instance:

e.g.,
SQL> create database link to_rds connect to <master_user_account> identified by <password>
     using '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=<dns or ip address of remote db>)(PORT=<listener port>))(CONNECT_DATA=(SID=<remote SID>)))';

Note: Use the connection details provided by AWS (endpoint, port, and service name or SID) to create your database link. In this example, assume a link named "RDS_DBLINK_EXAMPLE" has been created.
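
A quick way to confirm the link works before transferring anything is to query the remote instance through it, for example:

SQL> select sysdate from dual@RDS_DBLINK_EXAMPLE;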


Step-3: Create a privileged user to perform the import

Grant the following privileges to the user/schema that you will use for the import.

grant read, write on directory DATA_PUMP_DIR to TEST_SCHEMA_RDS;
grant execute on dbms_datapump to TEST_SCHEMA_RDS;
grant select_catalog_role to TEST_SCHEMA_RDS;
grant create session,create table to TEST_SCHEMA_RDS;
grant execute on dbms_file_transfer to TEST_SCHEMA_RDS;
GRANT UNLIMITED TABLESPACE TO TEST_SCHEMA_RDS;

Note: This assumes the target user already exists. Run the grants above as the admin (master) user provided by AWS.
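
If the target user does not exist yet, a minimal sketch of creating it as the AWS admin user follows. The password is a placeholder, and the default tablespace name simply matches the REMAP_TABLESPACE target used later in Step-6, assuming that tablespace already exists on the RDS instance:

SQL> create user TEST_SCHEMA_RDS identified by "<password>" default tablespace TEST_SCHEMA_RDS;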


Step-4: Transfer the dump file from the on-premises instance to the AWS RDS instance

Use DBMS_FILE_TRANSFER to copy the exported dump file from the source DB instance to the target DB instance.

Log in to the EC2 or on-premises Oracle instance and copy the dump file from the local EXPORT directory to the RDS DATA_PUMP_DIR.

The following PL/SQL block copies the dump file test1.dmp from the source EXPORT directory to DATA_PUMP_DIR on the target DB instance, using the database link RDS_DBLINK_EXAMPLE created in the previous step:


Connect to the source database on your local host / EC2 instance and run:


BEGIN
  DBMS_FILE_TRANSFER.PUT_FILE(
    source_directory_object      => 'EXPORT',
    source_file_name             => 'test1.dmp',
    destination_directory_object => 'DATA_PUMP_DIR',
    destination_file_name        => 'test1.dmp',
    destination_database         => 'RDS_DBLINK_EXAMPLE');
END;
/
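
To confirm that the copy succeeded, connect to the RDS instance and list DATA_PUMP_DIR (the same listing is used again in Step-7):

SQL> select filename, type, filesize, mtime from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR')) order by mtime;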


Step-5: Check the Data Pump directory name on the AWS RDS instance.


SQL> select owner,directory_name,directory_path from dba_directories;

OWNER       |DIRECTORY_NAME      |DIRECTORY_PATH

------------|--------------------|----------------------------------------
SYS         |DATA_PUMP_DIR       |/rdsoradata/datapump



Step-6: Use DBMS_DATAPUMP or impdp to import the dump file into the target DB instance

Many of the examples posted online do not work properly. The PL/SQL block below is adapted from the Oracle documentation and works correctly.

Here, change the dump file name, the schema names, and the REMAP_TABLESPACE options to match your environment.

set serveroutput on;
set timing on;
DECLARE
  ind NUMBER;              -- Loop index
  h1 NUMBER;               -- Data Pump job handler
  percent_done NUMBER;     -- Percentage of job complete
  job_state VARCHAR2(30);  -- To keep track of job state
  le ku$_LogEntry;         -- For WIP and error messages
  js ku$_JobStatus;        -- The job status from get_status
  jd ku$_JobDesc;          -- The job description from get_status
  sts ku$_Status;          -- The status object returned by get_status
BEGIN
  h1 := DBMS_DATAPUMP.OPEN('IMPORT','FULL',NULL,'EXAMPLE1');
  DBMS_DATAPUMP.ADD_FILE(h1,'test1.dmp','DATA_PUMP_DIR');
  DBMS_DATAPUMP.METADATA_REMAP(h1,'REMAP_SCHEMA','TEST_SCHEMA','TEST_SCHEMA_RDS');
  DBMS_DATAPUMP.METADATA_REMAP(h1,'REMAP_TABLESPACE','TEST_SCHEMA','TEST_SCHEMA_RDS');
  DBMS_DATAPUMP.SET_PARAMETER(h1,'TABLE_EXISTS_ACTION','REPLACE');
  DBMS_DATAPUMP.START_JOB(h1);

 percent_done := 0;
  job_state := 'UNDEFINED';
  while (job_state != 'COMPLETED') and (job_state != 'STOPPED') loop
    dbms_datapump.get_status(h1,
           dbms_datapump.ku$_status_job_error +
           dbms_datapump.ku$_status_job_status +
           dbms_datapump.ku$_status_wip,-1,job_state,sts);
    js := sts.job_status;

    -- If the percentage done changed, display the new value.
    if js.percent_done != percent_done
    then
      dbms_output.put_line('*** Job percent done = ' ||
                           to_char(js.percent_done));
      percent_done := js.percent_done;
    end if;

    -- If any work-in-progress (WIP) or error messages were received
    -- for the job, display them.
    if (bitand(sts.mask,dbms_datapump.ku$_status_wip) != 0)
    then
      le := sts.wip;
    else
      if (bitand(sts.mask,dbms_datapump.ku$_status_job_error) != 0)
      then
        le := sts.error;
      else
        le := null;
      end if;
    end if;
    if le is not null
    then
      ind := le.FIRST;
      while ind is not null loop
        dbms_output.put_line(le(ind).LogText);
        ind := le.NEXT(ind);
      end loop;
    end if;
  end loop;

  -- Indicate that the job finished and gracefully detach from it.
  dbms_output.put_line('Job has completed');
  dbms_output.put_line('Final job state = ' || job_state);
  dbms_datapump.detach(h1);
END;
/
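
Alternatively, as Step-6 mentions, the same import can be run with the impdp client from the EC2 / on-premises host, connecting over SQL*Net to the RDS endpoint. This is only a sketch; the endpoint and port are placeholders from your RDS connection details, and the service name is assumed to match the RDS database name testorcl:

impdp TEST_SCHEMA_RDS/<password>@//<rds-endpoint>:1521/testorcl directory=DATA_PUMP_DIR dumpfile=test1.dmp logfile=test1_impdp.log remap_schema=TEST_SCHEMA:TEST_SCHEMA_RDS remap_tablespace=TEST_SCHEMA:TEST_SCHEMA_RDS table_exists_action=replace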


Step-7: Clean up

Connect as the admin (master) user of the AWS RDS instance and run the queries below:

SQL> select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR'))
     order by mtime;

FILENAME                    TYPE          FILESIZE MTIME
--------------------------- ---------- ----------- ---------
test1.dmp                   file            290816 31-JAN-19
example.dmp                 file        7470804992 31-JAN-19

SQL>


For example, remove the dump files once the import has completed and been verified:

SQL> exec utl_file.fremove('DATA_PUMP_DIR','test1.dmp');
SQL> exec utl_file.fremove('DATA_PUMP_DIR','example.dmp');


SQL> select * from table(RDSADMIN.RDS_FILE_UTIL.LISTDIR('DATA_PUMP_DIR'))
     order by mtime;

FILENAME                      TYPE         FILESIZE MTIME
----------------------------- ---------- ---------- ---------
datapump/                     directory        4096 01-FEB-19


I hope this helped you.

Comments:

  1. Hi,

    Thank You so much for this article.

    I want to run IMPDP using a network link and import 5 schemas.

    I have given the following privileges to the Source user and Target user :


    grant create session, create table, create database link to MYUSER;
    alter user MYUSER quota unlimited on MYUSER_TS;
    grant read, write on directory data_pump_dir to MYUSER;
    grant select_catalog_role to MYUSER;
    grant execute on dbms_datapump to MYUSER;
    grant execute on dbms_file_transfer to MYUSER;


    In Schemas line, I have

    SCHEMAS=MySchema1,MySchema2,MySchema3,MySchema4,MySchema5

    ReMap_Schema= (I have remapped accordingly)


    When I execute IMPDP, I get the following Error :

    ORA-31631: privileges are required
    ORA-39109: Unprivileged users may not operate upon other users' schemas

    In Normal Oracle DB, we achieve this by giving the "DATAPUMP_IMP_FULL_DATABASE" Privilege. Can you please let us know how do we achieve this in RDS and Import other Schemas as well?

    Thanks,
    Prasanna K

  2. Hi,

    I resolved it. Found a work-around.

    Gave DBA Privilege to MYUSER and revoked it immediately after completing IMPDP.

    Grant DBA to MYUSER;

    Thanks,
    Prasanna K
