Sunday, February 20, 2011

Scheduling Jobs in SQL Server Express - Part 2

In my previous article Scheduling Jobs in SQL Server Express we saw how to make simple job scheduling in SQL Server 2005 Express work. We limited the scheduling to one time or daily repeats. Sometimes this isn't enough. In this article we'll take a look at how to make a scheduling solution based on Service Broker worthy of the SQL Server Agent itself.

We will try to imitate scheduled jobs provided by SQL Server Agent like making a scheduled job made up of multiple job steps. Every job will have it's own custom job schedule similar to the one you can set up with SQL Server Agent. We'll also be able to add and remove scheduled jobs and job schedules and start and stop a scheduled job.
Note: All scheduled job dates HAVE TO BE handled in UTC time since conversation timer works only in UTC date format.

Security

In part 1 we were limited to making jobs run only in our database. Because of that we couldn't backup any other databases. Since that is a big drawback for a complete scheduling solution we need a fix for this. This happens because the activation stored procedure always runs under the EXECUTE AS security context which is more restrictive than normal db_owner context we use. This is well explained in the Books Online article Extending Database Impersonation by Using EXECUTE AS. There are two ways to go about this.
The first one which is also used in this article is to set our scheduling database to trustworthy:
ALTER DATABASE TestScheduledJobs SET TRUSTWORTHY ON
Setting the database to trustworthy gives the members of the db_owner role sysadmin privileges. So be careful with this setting.
The second option is to sign the activation stored procedure with a certificate. How to do this is excellently explained by Remus Rusanu, a former developer of Service Broker on his blog here. This is extremely useful if you need finer access control in your database.
Solving the security issue allows us to have one database for all scheduling which is what a proper scheduling engine should have.

Required Tables

We need 4 tables for this to work well. You can add more tables such as job history but you'll have to modify the stored procedures to use them.

ScheduledJobs Table

This holds information about our scheduled jobs such as job name, enabled status, etc...
CREATE TABLE ScheduledJobs
(
    ID INT IDENTITY(1,1), 
    JobScheduleId INT NOT NULL,
    ConversationHandle UNIQUEIDENTIFIER NULL,
    JobName NVARCHAR(256) NOT NULL DEFAULT (''),
    ValidFrom DATETIME NOT NULL,
    LastRunOn DATETIME, 
    NextRunOn DATETIME, 
    IsEnabled BIT NOT NULL DEFAULT (0),
    CreatedOn DATETIME NOT NULL DEFAULT GETUTCDATE()
)

ScheduledJobSteps Table

This holds the job step name, the SQL statement to run in the step, whether to retry the step on failure and how many times, step duration, etc...
CREATE TABLE ScheduledJobSteps
(
    ID INT IDENTITY(1,1),
    ScheduledJobId INT NOT NULL,    
    StepName NVARCHAR(256) NOT NULL DEFAULT (''), 
    SqlToRun NVARCHAR(MAX) NOT NULL, -- sql statement to run
    RetryOnFail BIT NOT NULL DEFAULT (0), -- do we wish to retry the job step on failure
    RetryOnFailTimes INT NOT NULL DEFAULT (0), -- if we do how many times do we wish to retry it
    DurationInSeconds DECIMAL(14,4) DEFAULT (0), -- duration of the step with all retries 
    CreatedOn DATETIME NOT NULL DEFAULT GETUTCDATE(),
    LastExecutedOn DATETIME
)

JobSchedules Table

This holds the job's schedule. Multiple jobs can have the same schedule. Here we specify an absolute or relative scheduling frequency. This is mimicking most of the SQL Server Agent scheduling options. For weekly scheduling number 1 is Monday and number 7 is Sunday. This is because the ISO standard says that a week starts on Monday.
CREATE TABLE JobSchedules
(
    ID INT IDENTITY(1, 1) PRIMARY KEY,
    FrequencyType INT NOT NULL CHECK (FrequencyType IN (1, 2, 3)), 
        -- daily = 1, weekly = 2, monthly = 3. "Run once" jobs don't have a job schedule 
    Frequency INT NOT NULL DEFAULT(1) CHECK (Frequency BETWEEN 1 AND 100),
    AbsoluteSubFrequency VARCHAR(100), 
        -- '' if daily, '1,2,3,4,5,6,7' day of week if weekly, '1,2,3,...,28,29,30,31' if montly    
    MontlyRelativeSubFrequencyWhich INT, 
    MontlyRelativeSubFrequencyWhat INT,
    RunAtInSecondsFromMidnight INT NOT NULL DEFAULT(0) 
        CHECK (RunAtInSecondsFromMidnight BETWEEN 0 AND 84599), -- 0-84599 = 1 day in seconds
    CONSTRAINT CK_AbsoluteSubFrequency CHECK 
        ((FrequencyType = 1 AND ISNULL(AbsoluteSubFrequency, '') = '') OR -- daily check
         (FrequencyType = 2 AND LEN(AbsoluteSubFrequency) > 0) OR -- weekly check (days of week CSV)
         (FrequencyType = 3 AND (LEN(AbsoluteSubFrequency) > 0 -- monthly absolute option (days of month CSV)
             AND MontlyRelativeSubFrequencyWhich IS NULL 
             AND MontlyRelativeSubFrequencyWhat IS NULL)
             OR ISNULL(AbsoluteSubFrequency, '') = '') -- monthly relative option
                ), 
    CONSTRAINT MontlyRelativeSubFrequencyWhich CHECK -- only allow values if frequency type is monthly
         (MontlyRelativeSubFrequencyWhich IS NULL OR 
           (FrequencyType = 3 AND 
            AbsoluteSubFrequency IS NULL AND 
            MontlyRelativeSubFrequencyWhich IN (1,2,3,4,5)) -- 1st-4th, 5=Last
                                              ), 
    CONSTRAINT MontlyRelativeSubFrequencyWhat CHECK  -- only allow values if frequency type is monthly
      (MontlyRelativeSubFrequencyWhich IS NULL OR 
       (FrequencyType = 3 AND 
        AbsoluteSubFrequency IS NULL AND
        MontlyRelativeSubFrequencyWhich IN (1,2,3,4,5,6,7,-1)) -- 1=Mon to 7=Sun, -1=Day
                                              )
)
For monthly relative scheduling you can easily set the first/second/third/fourth/last day of the week or of the month.

SchedulingErrors Table

This table contains information about any errors that have happened in our job execution. Once a job errors out it will stop without any further execution.
CREATE TABLE SchedulingErrors
(
    Id INT IDENTITY(1, 1) PRIMARY KEY,
    ScheduledJobId INT, 
    ScheduledJobStepId INT,    
    ErrorLine INT,
    ErrorNumber INT,
    ErrorMessage NVARCHAR(MAX),
    ErrorSeverity INT,
    ErrorState INT,    
    ErrorDate DATETIME NOT NULL DEFAULT GETUTCDATE()
)

Required User-Defined Functions

dbo.GetNextRunTime

This calculates the next time our job will be run based on the schedule we created and the last run time or last ValidFrom time of the job. In the function we use an excellent calendar table F_TABLE_DATE and a user defined function called dbo.F_ISO_WEEK_OF_YEAR that gets the ISO Week number for a date. Both were created by Michael Valentine Jones, a regular on the SQLTeam.com forums. The GetNextRunTime function is quite complex but the comments in code should provide enough information for complete understanding. If they don't, ask questions in the comments.
The following code shows the part of the UDF that calculates the next date to run for the daily scheduling type:
-- ...  
-- DAILY SCHEDULE TYPE
IF @FrequencyType = 1
BEGIN        
    SELECT TOP 1 @NextRunTime = DATE
    FROM (
            SELECT  DATEADD(s, @RunAtInSecondsFromMidnight, DATE) 
                AS DATE, ROW_NUMBER() OVER(ORDER BY DATE) - 1 AS CorrectDaySelector
            FROM    dbo.F_TABLE_DATE(@LastRunTime, DATEADD(d, 2*@Frequency, @LastRunTime))
         ) t
    WHERE   DATE > @LastRunTime
            AND CorrectDaySelector % @Frequency = 0
    ORDER BY DATE
END
-- ...

Required Stored Procedures

These stored procedures make a nice and friendly interface to the scheduling functionality.
  • usp_AddJobSchedule: Adds a new job schedule.
  • usp_RemoveJobSchedule: Removes an existing job schedule.
  • usp_AddScheduledJob: Adds a new scheduled job. For "Run once" job types we don't need a Job Schedule so for @JobScheduleId parameter we have to pass -1 and for @NextRunOn we have to set a date in the future in UTC time.
  • usp_RemoveScheduledJob: Removes an existing scheduled job.
  • usp_AddScheduledJobStep: Adds a new scheduled job step for a job. It also calculates the next run time of the job.
  • usp_RemoveScheduledJobStep: Removes an existing scheduled job step from a job.
  • usp_StartScheduledJob: Used when starting a new or job or re-enabling an old disabled job by passing a new ValidFrom date and in the activation stored procedure to start the job anew for the next scheduled run.
  • usp_StopScheduledJob: Stops the scheduled job run by ending the conversation for it and setting it to disabled.
  • usp_RunScheduledJobSteps: Runs every job step and repeats it the set number of times if that option is enabled. After each successful job step execution the step duration time and last run on time is set.
  • usp_RunScheduledJob:  Activation stored procedure that is NOT meant to be run by hand. You should run it by hand only for debugging, when your messages are left in the queue. The stored procedure receives the dialog timer message from the queue and finds the scheduled job that corresponds with its conversation handle. After it gets the Scheduled job it runs its Job steps, sets the dialog timer for the next scheduled value and updates the last run time of the job.

Scheduling Code Examples

DECLARE @JobScheduleId INT, 
 @ScheduledJobId INT, 
 @validFrom DATETIME, 
 @ScheduledJobStepId INT, 
 @secondsOffset INT, 
 @NextRunOn DATETIME

SELECT    @validFrom = GETUTCDATE(), -- the job is valid from current UTC time
         -- run the job 2 minutes after the validFrom time. 
         -- we need the offset in seconds from midnight of that day for all jobs
        @secondsOffset = 28800, -- set the job time time to 8 in the morning of the selected day
        @NextRunOn = DATEADD(n, 1, @validFrom) -- set next run for once only job to 1 minute from now

-- SIMPLE RUN ONCE SCHEDULING EXAMPLE
-- add new "run once" scheduled job 
EXEC usp_AddScheduledJob @ScheduledJobId OUT, -1, 'test job', @validFrom, @NextRunOn
-- add just one simple step for our job
EXEC usp_AddScheduledJobStep @ScheduledJobStepId OUT, @ScheduledJobId, 'EXEC sp_updatestats', 'step 1'
-- start the scheduled job
EXEC usp_StartScheduledJob @ScheduledJobId 

-- SIMPLE DAILY SCHEDULING EXAMPLE
-- run the job daily
EXEC usp_AddJobSchedule @JobScheduleId OUT,
                        @RunAtInSecondsFromMidnight = @secondsOffset,
                        @FrequencyType = 1,
                        @Frequency = 1 -- run every day                      
-- add new scheduled job 
EXEC usp_AddScheduledJob @ScheduledJobId OUT, @JobScheduleId, 'test job', @validFrom
DECLARE @backupSQL NVARCHAR(MAX)
SELECT  @backupSQL = N'DECLARE @backupTime DATETIME, @backupFile NVARCHAR(512); 
    SELECT @backupTime = GETDATE(), 
    @backupFile = ''C:\Program Files\Microsoft SQL Server\MSSQL.1\MSSQL\Backup\AdventureWorks_'' + 
         replace(replace(CONVERT(NVARCHAR(25), @backupTime, 120), '' '', ''_''), '':'', ''_'') + 
            N''.bak''; 
    BACKUP DATABASE AdventureWorks TO DISK = @backupFile;'

EXEC usp_AddScheduledJobStep @ScheduledJobStepId OUT, @ScheduledJobId, @backupSQL, 'step 1'
-- start the scheduled job
EXEC usp_StartScheduledJob @ScheduledJobId 

-- COMPLEX WEEKLY ABSOLUTE SCHEDULING EXAMPLE
-- run the job on every tuesday, wednesday, friday and sunday of every second week
EXEC usp_AddJobSchedule @JobScheduleId OUT,
                        @RunAtInSecondsFromMidnight = @secondsOffset,
                        @FrequencyType = 2, -- weekly frequency type
                        @Frequency = 2, -- run every every 2 weeks,
                        @AbsoluteSubFrequency = '2,3,5,7' 
     -- run every Tuesday(2), Wednesday(3), Friday(5) and Sunday(7)    
-- add new scheduled job 
EXEC usp_AddScheduledJob @ScheduledJobId OUT, @JobScheduleId, 'test job', @validFrom
-- add three steps for our job
EXEC usp_AddScheduledJobStep @ScheduledJobStepId OUT, @ScheduledJobId, 'EXEC sp_updatestats', 'step 1'
EXEC usp_AddScheduledJobStep @ScheduledJobStepId OUT, @ScheduledJobId, 'DBCC CHECKDB', 'step 2'
EXEC usp_AddScheduledJobStep @ScheduledJobStepId OUT, @ScheduledJobId, 
  'select 1,', 'step 3 will fail', 1, 2 -- retry on fail 2 times
-- start the scheduled job
EXEC usp_StartScheduledJob @ScheduledJobId 

-- COMPLEX RELATIVE SCHEDULING SCHEDULING EXAMPLE
DECLARE @relativeWhichDay INT, @relativeWhatDay INT
SELECT    @relativeWhichDay = 4, -- 1 = First, 2 = Second, 3 = Third, 4 = Fourth, 5 = Last
        @relativeWhatDay = 3 -- 1 = Monday, 2 = Tuesday, ..., 7 = Sunday, -1 = Day
-- run the job on the 4th monday of every month 
EXEC usp_AddJobSchedule @JobScheduleId OUT,
                        @RunAtInSecondsFromMidnight = @secondsOffset, -- int
                        @FrequencyType = 3, -- monthly frequency type
                        @Frequency = 1, -- run every month,
                        @AbsoluteSubFrequency = NULL, -- no aboslute frequence if relative is set
                        @MontlyRelativeSubFrequencyWhich = @relativeWhichDay,
                        @MontlyRelativeSubFrequencyWhat = @relativeWhatDay
/*
some more relative monthly scheduling examples
run on:
the first day of the month:
  - @MontlyRelativeSubFrequencyWhich = 1, @MontlyRelativeSubFrequencyWhat = -1
the third thursday of the month:
  - @MontlyRelativeSubFrequencyWhich = 3, @MontlyRelativeSubFrequencyWhat = 4
the last sunday of the month:
  - @MontlyRelativeSubFrequencyWhich = 5, @MontlyRelativeSubFrequencyWhat = 7
the second wedensday of the month:
  - @MontlyRelativeSubFrequencyWhich = 2, @MontlyRelativeSubFrequencyWhat = 3
*/
-- add new scheduled job 
EXEC usp_AddScheduledJob @ScheduledJobId OUT, @JobScheduleId, 'test job', @validFrom
-- add just one simple step for our job
EXEC usp_AddScheduledJobStep @ScheduledJobStepId OUT, @ScheduledJobId, 'EXEC sp_updatestats', 'step 1'
-- start the scheduled job
EXEC usp_StartScheduledJob @ScheduledJobId 

-- SEE WHAT GOING ON WITH OUR JOBS
-- show the currently active conversations
-- look at dialog_timer column (in UTC time) to see when will the job be run next
SELECT GETUTCDATE(), dialog_timer, * FROM sys.conversation_endpoints
-- shows the number of currently executing activation procedures
SELECT * FROM sys.dm_broker_activated_tasks
-- see how many unreceived messages are still in the queue. should be 0 when no jobs are running
SELECT * FROM ScheduledJobQueue WITH (NOLOCK)
-- view our scheduled jobs' statuses
SELECT * FROM ScheduledJobs  WITH (NOLOCK)
SELECT * FROM ScheduledJobSteps WITH (NOLOCK)
SELECT * FROM JobSchedules  WITH (NOLOCK)
SELECT * FROM SchedulingErrors WITH (NOLOCK)

Conclusion

When we look at this solution we can see it's a pretty powerful scheduling engine. If needed it can be extended for hourly or minute-level scheduling but that is an exercise for the reader. The script might look overly complex but once you have it set up it's use is fast and easy

No comments:

Post a Comment