help parallel                                                                                                               also see:  miparallel
-------------------------------------------------------------------------------------------------------------------------------------------------



Title



    parallel -- Stata module for Parallel computing



Index






    Sections                  



    1.  Syntax                Command syntax.
    2.  Description           Command description.
    3.  Details               How parallel works.
    4.  Parallel Append       Using -parallel append- syntax.
    5.  Caveats               Things to consider before using parallel.
    6.  Technical note        Some details under the hood.
    7.  Examples              Some examples using parallel
    8.  Saved results         A list of parallel's saved results
    9.  Citation              How to cite parallel.
    10. Development           Up-to-date version and bug reporting
    11. Source code           parallel's (MATA) source code
    12. Authors               Authors behind parallel
    13. Contributors          Notable contributors
    14. Also see              Other modules related to parallel
    15. FAQs                  Frequently Asked Questions









    Available commands         



    1.  parallel initialize    Setting the number of child processes.
    2.  parallel numprocessors Getting the number of processors on the system.
    3.  parallel do            Parallelizing a do-file.
    4.  parallel : (prefix)    Parallelizing a Stata command (parallel prefix).
    5.  parallel bs            Parallel bootstrapping.
    6.  parallel sim           Parallel simulate.
    7.  parallel append        Multiple file processing and appending.
    8.  parallel clean         Removing auxiliary files.
    9.  parallel printlog      Checking out child processes' log files.
    10. parallel version       Query parallel current version.
    11. parallel citation      How to cite parallel.






1. Syntax



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Setting the number of child processes (threads/processors)



        parallel initialize [ # , force statapath(stata_path) includefile(filename) hostnames(string) ssh(string) procexec(int)]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Getting the number of processors on the system



        parallel numprocessors



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Parallelizing a do-file



        parallel do filename [, by(varlist) force nodata setparallelid(pll_id) execution_options]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Parallelizing a Stata command (parallel prefix)



        parallel [, by(varlist) force keep nodata setparallelid(pll_id) execution_options]:  command



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Parallel bootstrapping



        parallel bs [, expression(exp_list) execution_options bs_options ] [: command]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Parallel simulate



        parallel sim [ , expression(exp_list) execution_options sim_options ] [: command]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Multiple file processing and appending



        parallel append [file(s)] , do(cmd|dofile) [in(in) if(if) expression(expand expression (see details)) execution_options ]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Removing auxiliary files



        parallel clean [, event(pll_id) all force]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Checking out child processes' logfiles by printing the output.



        parallel printlog [#] [, event(pll_id)]



    Checking out child processes' logfiles by showing the output in a view window.



        parallel viewlog [#] [, event(pll_id)]



    ---------------------------------------------------------------------------------------------------------------------------------------------
    Query parallel current version



        parallel version



        parallel citation



    options          Description
    -------------------------------------------------------------------------------------------------------------------------------------------
    Setting the number of child processes
      #               The number of child processes. If omitted the default is max(floor(num_processors*0.75),1)
      force          Overrides the restriction on using more child processes than processors on your machine (see the WARNING in description).
                       This option is assumed when specifying hostnames.
      statapath      File path. parallel tries to automatically identify Stata's exe path. By using this option you will override this and
                       force parallel to use a specific path to stata.exe.
      includefile    File path. This file will be included before parallel commands are executed. The target purpose for this is to allow one
                       to copy over preferences that parallel does not copy automatically.
      hostnames       a space delimited list of hostnames. For the local machine, use localhost.  Work will be assigned in the order of the
                       list and the list elements will be re-used if num child processes is longer than the list.  An example would be
                       localhost node2 node3.  If no option is provided, then localhost is assumed. Leave blank for local execution.
      ssh            The command used to connect to remote machines.  If none is provided, this will be ssh. This option is not needed for
                       local execution.
      procexec        On Windows, controls how child processes are spawned.  The default value 2 launches them in a hidden desktop (they can
                       still be seen in the task manager) so that the child applications don't briefly steal the window focus.  With value 1
                       the child processes are launched auto-minimized in the user's desktop, but will still briefly steal the focus.  Value
                       0 uses the deprecated shell method (see the Windows-shell section), which will steal focus and perhaps briefly show
                       the windows of the child processes.
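
    As a quick check of the default rule for # given above, the arithmetic can be reproduced in plain Stata. This is only a sketch: the
    processor counts (1, 4 and 8) are hypothetical values chosen for illustration and are not queried from the system.

```stata
* Default number of child processes: max(floor(num_processors*0.75), 1)
* The processor counts below are assumed values, for illustration only.
foreach num_processors in 1 4 8 {
    local children = max(floor(`num_processors' * 0.75), 1)
    display "processors: `num_processors'  ->  default child processes: `children'"
}
```

    The max(., 1) term guarantees at least one child process on a single-processor machine, where floor(1*0.75) would otherwise be zero.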



    execution_options
      keep           Keeps auxiliary files generated by parallel.  Use this and the next option with care, as there can be many files that
                       take up space.
      keeplast       Keeps auxiliary files and removes those saved the last time during the current session.
      programs       A list of programs to be passed to each child process.  To do this, parallel needs to echo the contents of those programs
                       to the output window.  If parallel is being run from inside an ado (say my_cmd.ado) and you need to access local
                       subroutines (other programs defined in the ado beside the primary my_cmd), then you must pass their names in this option
                       as my_cmd.local_subroutine_name for them to be accessible.
      mata           If the algorithm needs to use mata objects, this option passes to each child process every mata object loaded in the
                       current session (including functions).  Note that when mata objects are loaded into the child processes they will
                       have different locations and therefore pointers may no longer be accurate.
      noglobal       Avoid passing current session's globals to the child processes.
      seeds          Numlist. With this option the user can pass a specific seed to be used within each child process.
      randtype       String. Tells parallel whether to use the current seed (-current-), the current datetime (-datetime-) or random.org API
                       (-random.org-) to generate the seeds for each child process (please read the Description section).
      processors     Integer. If running on StataMP, sets the number of processors each child process should use. Default value is 1, to help
                       avoid the sum total of Stata processes across child instances being more than the number of physical processors (which
                       can severely limit performance).
      timeout        Integer. If a child process hasn't started, the time in seconds that parallel waits before assuming that there was a
                       connection error and thus that the child process won't start. Default value is 60.
      outputopts      A list of option names that are aggregating output options.  parallel automatically aggregates main data from child
                       processes.  Often, though, a program will aggregate more than one type of data.  outputopts allows generic file-based
                       aggregation (appending).  A sequential call such as my_prog, output1(outputfile.dta) can be converted to parallel,
                       outputopts(output1): my_prog, output1(outputfile.dta).  parallel will execute each child process with its own file
                       passed to output1 and at the end, append them all and save it to outputfile.dta.
      deterministicoutput
                       Eliminates displayed output that would vary depending on the machine (e.g. timers, seeds, and number of parallel
                       child processes) so that log files can be easily compared across runs. Errors are still printed.



    Byable parallelization
      by             Varlist. Tells the command by which variables the current dataset can be divided, avoiding splitting a story (panel)
                       over two or more child processes.  The semantics for by are not the same as for Stata.  When Stata implements
                       by, the command that is run will only see a section of the data where the by-variables are the same.  parallel's
                       semantics are that no observations with the same by-values will be in different child processes.  It pools together
                       combinations when there are fewer child processes than by-var combinations.  If you need Stata-style semantics, the
                       solution is to add by in the subcommand.  For example, parallel, by(byvar): by byvar: egen x_max = max(x).
      force          When using by, parallel checks whether the dataset is properly sorted. By using force the command skips this check.



    Parallel bootstrap
      expression     An exp_list to be passed to the bootstrap command.
      bs_options      Further options to be passed to the bootstrap command, including the optional reps() parameter.



    Parallel simulate
      expression     An exp_list to be passed to the simulate command.
      sim_options     Further options to be passed to the simulate command, including the required reps() parameter.



    Multiple file processing and appending
      do             Stata cmd or dofile.  Note that parallel do does not support passing options to the do-file.  If you need arguments then
                       use the prefix style.
      files          Explicit list of files to process.
      expression     String. Expression representing file names in the form of "%fmts,  numlist1 [, numlist2 [, ...]]"



    Removing auxiliary files
      event          String. Specifies which executed (and stored) event's files should be removed.
      all            Tells parallel to remove all remnant auxiliary files generated by it in the current directory.
      force          Forces the command to remove (apparently) in-use auxiliary files. Otherwise these will not get deleted.



    Other options
      event          String. With printlog and viewlog this specifies which event's log files should be displayed.
      setparallelid  Programmers' option. Forces parallel to use a specific id (pll_id) (see Technical Notes).
      nodata         Tells parallel not to use loaded data and thus not to try splitting or appending anything.



2. Description



    -parallel- implements parallel computing without requiring StataMP, which can substantially reduce computing time. Especially suitable
    for bootstrapping and simulations, parallel includes out-of-the-box tools for implementing such algorithms.



    To use -parallel-, first set the number of child processes to work with. To do this, use the -parallel initialize- syntax, replacing #
    with the desired number of child processes. Setting more child processes than the computer has physical cores is not recommended (see
    the WARNINGS at the end of this section).



    -parallel do- is the equivalent of (a wrapper for) -do-. With this syntax parallel runs the do-file in as many child processes as were
    specified by the user; that is, it starts $PLL_CHILDREN Stata instances in batch mode. By default the loaded dataset is split into the
    number of child processes specified by -parallel initialize- and the do-file is executed independently over each of the data chunks;
    once all the parallel instances stop, the datasets are appended. To avoid loading the current dataset in the child processes, specify
    the -nodata- option.



    -parallel :- (as a prefix) splits the loaded dataset and executes a stata_cmd over the specified number of data chunks in order to
    speed up computations. As with -parallel do-, once all the parallel instances stop, the datasets are appended.



    -parallel bs- and -parallel sim- are parallel wrappers for the commands -bootstrap- and -simulate-. These algorithms are especially
    well suited to -parallel-, as they are embarrassingly parallel. In terms of syntax, besides the command names, the only difference
    between these two commands and their serial versions is how expressions are passed (please refer to the examples section for this).



    Every time -parallel- runs, several auxiliary files are generated which are automatically deleted once it finishes. If the user sets
    -keep- or -keeplast-, the auxiliary files are kept, and the syntax -parallel clean- becomes handy. With -parallel clean- the user can
    remove the last generated auxiliary files (default option), the files of a specific parallel instance (using its pll_id number), or
    every stored auxiliary file (with -all-). For security reasons, in-use auxiliary files will not be deleted unless the user specifies
    the option force (see the Technical note section for more information about this).  Log files from the runs are stored in c(tmpdir)
    so that they can be inspected by the user.  The user will likely want to delete these periodically with parallel clean, all.
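
    The keep/inspect/clean cycle described above might look as follows. This is a minimal sketch that assumes -parallel- is installed and
    uses only the syntaxes listed in the Syntax section; the auto dataset, the two child processes and the variable name price2 are
    arbitrary choices for illustration.

```stata
sysuse auto, clear
parallel initialize 2

* Keep the auxiliary files generated by this run
parallel, keep: gen double price2 = price^2

* Print the log file of the first child process
parallel printlog 1

* Remove every auxiliary file left by parallel in the current directory
parallel clean, all
```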



    When handling multiple files (for example, a big dataset divided into multiple dta files), -parallel append- becomes handy as it
    allows the user to process them simultaneously. Given a list of files and a Stata cmd or dofile, -parallel append- opens each file,
    executes the cmd/dofile on it, stores each file's results and appends them into a single file.  Also, if the files to be processed
    share a base-name pattern, the user can provide -parallel append- with an expression representing the list of files to be processed;
    for information on how to use this feature, see the section Parallel Append.



    Given N child processes, within each child process -parallel- creates the macros pll_id (equal for all the child processes) and
    pll_instance (ranging from 1 to N, equal to 1 inside the first child process and N inside the last child process), both as global and
    local macros. This allows the user to set different tasks/actions depending on the child process. The global macro PLL_CHILDREN
    (equal to N) is also available within each child process. For an example using these macros, please refer to the Examples section.



    By default, -parallel- automatically identifies the path of Stata's executable file. This is necessary as it is used to run Stata in
    batch mode (the core of the module). According to some reports, however, that file path is not always correctly identified; in that
    case the option -statapath- in -parallel initialize- can be used to set it manually.



    For pseudo-random numbers, the module allows passing a different seed to each child process. Moreover, if the user does not provide a
    numlist of seeds, -parallel- generates its own numlist of seeds using one of three options:  (1) based on the current seed; (2) using
    the current datetime and user as a seed to generate each seed, restoring the original seed afterwards; or (3) using the random.org
    API (requires an internet connection) to directly generate each seed (also restoring the original seed afterwards). -parallel- saves
    the seeds used in the r(pll_seeds) macro.
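
    To illustrate the seed handling described above, the sketch below passes one explicit seed per child process and then displays the
    seeds reported by -parallel-. It assumes -parallel- is installed; the seed values, the auto dataset and the variable name u are
    arbitrary, and the exact formatting of r(pll_seeds) is not specified here.

```stata
sysuse auto, clear
parallel initialize 2

* One seed per child process (the values are arbitrary)
parallel, seeds(1234 5678): gen double u = runiform()

* Seeds used by the last call, as saved by parallel
display "`r(pll_seeds)'"
```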



    WARNINGS For each child process -parallel- starts a new Stata instance (thus running as many Stata processes as child processes);
    should the user set more child processes than the computer has cores, the computer may freeze.



3. Details



    Inspired by the R library ``snow'' and intended for multicore CPUs, -parallel- implements parallel computing methods through the OS's
    shell scripting (using Stata in batch mode) to speed up computations by splitting the dataset across a given number of child
    processes, thereby implementing a data-parallelism algorithm.



    The number of efficient computing child processes depends upon the number of physical cores (CPUs) in your computer; e.g., if you
    have a quad-core computer, the correct child process setting is four. With simultaneous multithreading, such as Intel's
    hyper-threading technology (HTT), setting -parallel- to the number of processor threads, as expected, hardly yields perfect speedup
    scaling. Even so, several tests on HTT-capable architectures show small though significant differences between running -parallel-
    according to the machine's physical cores and according to its logical cores.



    -parallel- is especially handy for implementing loop-based simulation models (or simply loops), Stata commands such as reshape, or
    any job that (a) can be repeated over data blocks, and (b) processes big datasets (see the append section).  Furthermore, the commands
    -parallel bs- and -parallel sim- are specially designed to easily implement bootstrapping and (Monte Carlo) simulations in a parallel
    fashion.



    At this time -parallel- has been successfully tested in Windows, Unix and MacOS for Stata versions 11 to 14.



    -parallel- does not change the RNG state (even if subcommands invoke randomization functions).



    Several tests have shown that, thanks to how -parallel- has been written, the algorithm can also be used to implement other parallel
    computing techniques, such as Monte Carlo simulations, simultaneously running models, etc. An extensive example through Monte Carlo
    Simulations is provided here.



    To distribute work across different machines in a computer cluster, the machines need to be Linux/MacOS, share a global file-system (e.g.
    NFS), and have a non-interactive way to remotely execute commands.  The most common way to remotely execute commands is to use ssh with
    keyfiles so that no password is needed.  This is still a new feature, and synchronizing across machines in child processes can have odd
    corner cases, so users may encounter some trouble getting this to work.
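
    A cluster setup following the description above could be initialized as sketched below. The hostnames node2 and node3 are
    placeholders, and the call assumes the machines already share a file system and accept password-less ssh connections; ssh(ssh) only
    spells out the documented default.

```stata
* Six child processes assigned in list order, re-using the list as needed:
* localhost, node2, node3, localhost, node2, node3.
* node2 and node3 are placeholder hostnames.
parallel initialize 6, hostnames(localhost node2 node3) ssh(ssh)
```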



4. Parallel Append



    Imagine we have several dta files named -income.dta- stored in a set of folders ranging from 2008_01 to 2012_12, that is, a total of
    60 files (12 times 5), ordered by month, which may look something like this:



         2008_01/income.dta
         2008_02/income.dta
         2008_03/income.dta



         ...more files...



         2010_01/income.dta
         2010_02/income.dta
         2010_03/income.dta



         ...more files...



         2012_10/income.dta
         2012_11/income.dta
         2012_12/income.dta



    Now, imagine that for each and every one of those files we would like to execute the following program:



         program def myprogram
                 gen female = (gender == "female")
                 collapse (mean) income, by(female) fast
         end



    Instead of writing a forval/foreach loop (which would be the natural solution for this situation), -parallel append- allows us to smoothly
    solve this with the following command.



        . parallel append, do(myprogram) prog(myprogram) ///
                e("%g_%02.0f/income.dta, 2008/2012, 1/12")



    Where element by element, we are telling parallel:
        (1) do(myprogram): execute the command -myprogram-,
        (2) prog(myprogram): -myprogram- is a user written program, and
        (3) e("%g_%02.0f/income.dta, 2008/2012, 1/12"): this should process files 2008_01/income.dta up to 2012_12/income.dta.
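
    To see which file names the expression in (3) stands for, the expansion can be mimicked in plain Stata. This is only a sketch of the
    naming pattern, not of how -parallel append- builds the list internally; the loop order and the local macro names (files, nfiles, f)
    are assumptions made for illustration.

```stata
* Mimic e("%g_%02.0f/income.dta, 2008/2012, 1/12"):
* %g formats the year, %02.0f formats the month with a leading zero.
local files
local nfiles = 0
forvalues year = 2008/2012 {
    forvalues month = 1/12 {
        local f = string(`year', "%g") + "_" + string(`month', "%02.0f") + "/income.dta"
        local files `files' `f'
        local ++nfiles
    }
}
display "`nfiles' files, from `: word 1 of `files'' to `: word `nfiles' of `files''"
```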



    Besides the simplicity of its syntax, the advantage of using -parallel append- lies in doing so in a parallel fashion; that is,
    instead of processing one file at a time, -parallel- processes these files in groups of as many files as there are child processes.
    Step by step, what this command does is:






       1. Distribute groups of files across child processes



        Once each child process starts, for each dta file



       2. Opens the file using [if] [in] according to the in and if options.
       3. Executes the command/dofile specified by the user.
       4. Stores the results in a temp dta file.



        Finally, once all the files have been processed



       5. Appends all the resulting files into a single one.






5. Caveats



    When the -stata_cmd- or -do-file- saves results, none of them will be kept, as -parallel- runs Stata in batch mode. This is also true
    for matrices, scalars, mata objects, returns, or any other object other than data.



    Although -parallel- passes through programs, macros and mata objects, in the current version it is not capable of doing the same with
    matrices or scalars.  The tempname internal state is copied to the children, but the parent does not receive any of this state from
    the children.  That is, -parallel- advances the tempname (tempvar) sequence in the children so as not to overlap with any produced by
    the parent.
 
    If the number of tasks to be done is less than the number of child processes, parallel will temporarily reduce the number of child
    processes. This is reported in the global $LAST_PLL_N.
 
    Expressions run in the child processes that contain _n or _N will be evaluated within the child's data chunk, not the parent
    dataset.  These expressions may therefore give different results when run in parallel than without parallel.
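
    The _n caveat can be made concrete with a small sketch. It assumes -parallel- is installed and that the dataset is split into two
    chunks; the variable names parent_n and child_n are made up for this example.

```stata
sysuse auto, clear
parallel initialize 2

* Evaluated in the parent session: runs from 1 to _N over the whole dataset
gen long parent_n = _n

* Evaluated inside each child process: numbering restarts within each chunk
parallel: gen long child_n = _n

list parent_n child_n in 1/3
list parent_n child_n in -3/l
```

    If a command needs the position in the full dataset, one option is to compute it in the parent first (as parent_n here) and use
    that variable inside the parallelized command.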
 
    When executing Stata on separate machines via ssh, no environment variables except PWD and STATATMP are copied over.






6. Technical note



    In order to protect a pll_id code (and thus ancillary files), once -parallel- is called it creates a new file called __pll[pll_id]sandbox
    (stored at c(tmpdir)). This forbids -parallel clean- from deleting any auxiliary file
    used by that process and reserves the pll_id so that no other call of -parallel- can use this pll_id. Once every child process has
    finished, the sandbox file is removed, freeing the pll_id.



    If for any reason the algorithm breaks due to a flaw or a crash of the system, the sandbox file and the rest of the auxiliary files
    will not be deleted. To clean these up, the user can do so manually (moving the file(s) to the OS recycle bin) or use the parallel
    clean, all force syntax. This way all sandbox files in the c(tmpdir) folder and auxiliary files stored in the current directory will
    be deleted.



    In earlier versions of -parallel-, tempfile generation was not safe, as multiple Stata instances running simultaneously could
    overwrite each other's tempfiles. Starting with version 1.14, this is no longer a problem as each Stata instance starts with a
    different c(tmpdir) location.  This way, the instances' tempfile management will not interfere with each other, making it safe to use
    commands or algorithms that depend on tempfile generation (such as preserve and restore).



    The option -setparallelid- is designed to let programmers recycle a parallel id (pll_id). Intended to be used with -parallel_sandbox-
    (undocumented, please refer to the source code of -parallel_sandbox()-), this option allows calling parallel several times using the
    same pll_id, which makes auxiliary file management far simpler. Take the following example:



         program def mypllwrapper
                
                 // Reserving a pll_id
                 m: parallel_sandbox(5)
                
                 // Using the generated pll_id
                 save __pll`parallelid'_mypllwrapper, replace
                
                 // Recycling the pll_id
                 forval i=1/10 {
                         parallel, setparallelid(`parallelid') keep: some_other_cmd
                 }
                
                 // Cleaning up and freeing the pll_id. This will remove all files
                 // and folders named with prefix '__pll[parallelid]'
                 parallel clean, e(`parallelid')
                 m: parallel_sandbox(2,"`parallelid'")
                
         end



    For a real example of this, please see -parallel_bs.ado- and -parallel_sim.ado-.



Windows-shell: Spawning child processes with shell command on Windows (Deprecated)



    Originally child processes on Windows were spawned as they were on other platforms, using Stata's shell methods (e.g. winexec).  This
    had a number of problems (spawned processes stole the UI focus, failure to recover from killed child processes, difficulty in
    batch-mode), so now Windows uses a plugin that launches the child processes directly using Win32 system calls.  The original
    functionality is retained, but deprecated. To enable it you must specify the procexec(0) option.



    Since shell commands are ignored by Stata in batch-mode on Windows, a workaround is needed. The method is to have Stata write out the
    commands to be executed to a file (called the gateway) and have a separate process read new inputs to this file and execute the commands.
    This latter part requires the user to install Cygwin and run a few commands prior to starting Stata. In a Cygwin terminal, navigate to the
    appropriate directory and do the following:



        $ rm pll_gateway.sh
        $ touch pll_gateway.sh
        $ tail -f pll_gateway.sh | bash



    Then you can execute your Stata script in batch-mode on Windows. The Cygwin tail process can stay running through multiple uses.



    The default gateway file assumed is pll_gateway.sh. If you would like a different file modify the Cygwin script above and pass a new value
    for gateway(gateway_path) to parallel initialize.



    Since Cygwin is going to execute the commands to start the parallel Stata instances it needs a Cygwin-like Stata path. If the user does not
    specify the Stata path then -parallel- will take the generated windows path and convert it to "/cygdrive/<drive letter>/...".  If this does
    not work you will need to specify the statapath explicitly.



    In this mode, there is no automatic way for the parent process to stop the child processes in case the user has requested a break in
    execution.  The original (but now deprecated) parallel break can still be used (and mata equivalents parallel_break() and
    _parallel_break()).  This is a call that you write into the code that executes in the children and that queries whether the parent
    process has requested a break.  If this is not used appropriately, and a child process is executing for a long period (e.g. an
    endless loop), the user must kill the child processes manually.






Example 1: using prefix syntax



    In this example we'll generate a variable containing the maximum blood-pressure measurement (bp) by patient.



    Setup for a quad-core computer
        . sysuse bplong.dta
        . sort patient
        
        . parallel initialize 4



    Computes the maximum of bp for each patient. We add the option by(patient) to tell parallel not to split stories.
        . parallel, by(patient): by patient: egen max_bp = max(bp)
        
    Which is the ``parallel way'' to do:



        . by patient: egen max_bp = max(bp)
        
    Giving you the same result.



        
Example 2: using -parallel do- syntax



    Another use that may benefit greatly from -parallel- is implementing loop-based simulations. Imagine that we have a model that
    requires looping over each and every record of a panel-data dataset.



    Using -parallel-, the proper way to do this would be using the ``parallel do'' syntax



        . use mybigpanel.dta, clear



        . parallel initialize 4
        . parallel do mymodel.do
        
        . collapse ...



    where mymodel.do would look something like this
        
        ----------------------------------- begin of do-file ------------
                local maxiter = _N
                forval i = 1/`maxiter' {
                        ...some routine...
                }
        ----------------------------------- end of the do-file ----------



    Or, in the case of using mata, this would look something like this



        ----------------------------------- begin of do-file ------------
                mata:
                N=c("N")
                for(i = 1;i<=N;i++) {
                        ...some routine...
                }
                end
        ----------------------------------- end of the do-file ----------



Example 3: setting the right path



    If -parallel- identifies the path to stata.exe wrongly, the -statapath- option will correct the situation. So, if "C:\Archivos de
    programa\Stata12/stata.exe" is the right path we only have to write:



        . parallel initialize 2, s("C:\Archivos de programa\Stata12/stata.exe")






Example 4: Using -parallel bs-



    In this example we'll evaluate a regression model using bootstrapping



    Setup for a quad-core computer
        . sysuse auto, clear
        
        . parallel initialize 4



    Running parallel bs.
        . parallel bs: reg price c.weig##c.weigh foreign rep
        
    Which is the ``parallel way'' to do:



        . bs: reg price c.weig##c.weigh foreign rep






Example 5: Using -parallel sim-



    Example from simulate



    Setup for a quad-core computer
        . parallel initialize 4



    Experiment that will be performed
        program define lnsim, rclass
                version 17
                syntax [, obs(integer 1) mu(real 0) sigma(real 1) ]
                drop _all
                set obs `obs'
                tempvar z
                gen `z' = exp(rnormal(`mu',`sigma'))
                summarize `z'
                return scalar mean = r(mean)
                return scalar Var  = r(Var)
        end



    Running parallel sim.
        . parallel sim, expr(mean=r(mean) var=r(Var)) reps(10000): lnsim, obs(100)
        
    This is the ``parallel way'' of running:



        . simulate mean=r(mean) var=r(Var), reps(10000): lnsim, obs(100)






Example 6: Using -pll_instance- and -PLL_CHILDREN- macros



    Using the global macros -pll_instance- and -PLL_CHILDREN-, the user can run -parallel- in such a way that each child process performs a
    different task. Take the following example:



    Setup for a quad-core computer
        . parallel initialize 4
        . sysuse auto, clear



        program def myprog
                gen x = $pll_instance
                gen y = $PLL_CHILDREN
        
                // For the first child process
                if ($pll_instance == 1) gen z = exp(2)
        
                // For the second child process
                else if ($pll_instance == 2) {
                        summ price
                        gen z = r(mean)
                }
        
                // For the third and fourth child processes
                else gen z = 0
        end



    Running the program
        . parallel, prog(myprog): myprog



    Here, running with 4 cores, the program -myprog- performs different actions depending on the value of -pll_instance-. For the
    observations in the first child process, -parallel- will generate -z- equal to exp(2); for those in the second child process it will
    generate -z- equal to the average price; and for the rest of the child processes it will generate -z- equal to zero.
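


    One way to check this outcome, assuming the results from all child processes are appended back into the dataset in memory, is to
    summarize -z- by the child process indicator -x-:



        . tabstat z, by(x) statistics(mean min max)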



8. Saved results



    -parallel- saves the following in r():



    Scalars        
      r(pll_n)            Number of parallel child processes last used
      r(pll_t_fini)       Time taken to append and clean
      r(pll_t_calc)       Time taken to complete the parallel job
      r(pll_t_setu)       Time taken to set up (before the parallelization) and to finish the job (after the parallelization)
      r(pll_errs)         Number of child processes that stopped with an error



    Macros         
      r(pll_id)           Id of the last parallel instance executed (needed to use parallel clean)
      r(pll_dir)          Directory where parallel ran and stored the auxiliary files.
      r(pll_seeds)        Seeds used within each child process.
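


    As a sketch of how these results might be inspected, the commands below run a small job and then display some of the values listed
    above. The variable price2 is hypothetical, and the results should be read right after -parallel- finishes, before another r-class
    command overwrites r():



        . sysuse auto, clear
        . parallel initialize 2
        . parallel: gen price2 = price^2
        . return list
        . display "children: " r(pll_n) "  errors: " r(pll_errs)
        . display "instance id: `r(pll_id)'"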






    -parallel bs- and -parallel sim- save the following in e():



    Scalars        
      e(pll)              1.






    -parallel version- saves the following in r():



    Macros         
      r(pll_vers)         Current version of the module.



    -parallel numprocessors- saves the following in r():



    Scalars        
      r(numprocessors)    Number of logical processors on the system.






    -parallel- saves the following global macros:



      $LAST_PLL_DIR       A copy of r(pll_dir).
      $LAST_PLL_N         A copy of r(pll_n).
      $LAST_PLL_ID        A copy of r(pll_id).
      $PLL_LASTRNG        Number of times that -parallel_randomid()- has been executed.
      $PLL_STATA_PATH, $PLL_CLUSTERS (deprecated), $PLL_CHILDREN, $USE_PROCEXEC, $PLL_HOSTNAMES, $PLL_SSH
                            Internal usage.






9. Citation
    When using parallel, please include the following citation:



    Vega Yon GG, Quistorff B. parallel: A command for parallel computing. The Stata Journal. 2019;19(3):667-684. doi:10.1177/1536867X19874242



    For a bibentry, check out the parallel citation command.



10. Development



    The latest version of -parallel- is always available. One option is its GitHub repository (development source code):



        https://github.com/gvegayon/parallel



    Or install it directly from within Stata:



        . net install parallel, from(https://raw.github.com/gvegayon/parallel/master/) replace
        . mata mata mlib index






    You can track new releases on GitHub or by following the RSS feed https://github.com/gvegayon/parallel/releases.atom






    To report a bug, submit an issue here:



        https://github.com/gvegayon/parallel/issues



    Please try the latest version to see if your problem has been solved.  Include the steps to reproduce the issue and the output of the Stata
    command -creturn list-.
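


    For example, the information for a report could be gathered as follows (-parallel version- and its saved result r(pll_vers) are
    described in the Saved results section):



        . parallel version
        . display "`r(pll_vers)'"
        . creturn list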






11. Mata source code



    Most of -parallel- is programmed in Mata. This means that, unlike typical ado-files, -parallel- is distributed with the lparallel Mata
    library (compiled code), so users cannot read the source code directly. For this reason the package includes the help file
    parallel_source.sthlp, which contains the formatted source code.



    In order to get access to different sections of the source code you can follow these links:



        Stop a child process when the user presses break  parallel_break.mata
        Remove auxiliary files                            parallel_clean.mata
        Distribute observations across child processes    parallel_divide_index.mata
        Export global macros                              parallel_export_globals.mata
        Export programs                                   parallel_export_programs.mata
        Wait until a child process finishes               parallel_finito.mata
        (under development)                               parallel_for.mata
        Normalize a filepath                              parallel_normalizepath.mata
        Generate random alphanum                          parallel_randomid.mata
        Launch simultaneous Stata instances (batch mode)  parallel_run.mata
        Set of tools to protect parallel aux files        parallel_sandbox.mata
        Set the number of child processes                 parallel_initialize.mata
        Set the Stata EXE directory                       parallel_setstatapath.mata
        Write a ``diagnosis''                             parallel_write_diagnosis.mata
        Write a do-file to be parallelized                parallel_write_do.mata






12. References



    Luke Tierney, A. J. Rossini, Na Li and H. Sevcikova (2012). snow: Simple Network of Workstations. R package version 0.3-9. 
        http://CRAN.R-project.org/package=snow
    R Core Team (2012). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN
        3-900051-07-0, URL http://www.R-project.org/.
    George Vega Y (2012). Introducing PARALLEL: Stata Module for Parallel Computing. Chilean Pension Supervisor, Santiago de Chile, URL 
        http://fmwww.bc.edu/repec/bocode/p/parallel.pdf.
    George Vega Y (2013). Introducing PARALLEL: Stata Module for Parallel Computing. Stata Conference 2013, New Orleans (USA), URL 
        http://ideas.repec.org/p/boc/norl13/4.html.
    Haahr, M. (2006). Random.org: True random number service. Random.org. http://www.random.org/clients/http/.






13. Authors



    George Vega Yon [cre,aut], University of Southern California. mailto:g.vegayon@gmail.com http://ggvy.cl/



    Brian Quistorff [aut], Bureau of Economic Analysis. mailto:brian-work@quistorff.com http://quistorff.com



14. Contributors



    Special thanks to: Elan P. Kugelmass (aka epkugelmass on GitHub) [ctb], Timothy Mak (University of Hong Kong) (author of miparallel)



    Damian C. Clarke (Oxford University, England), Felix Villatoro (Superintendencia de Pensiones, Chile), Eduardo Fajnzylber (Universidad
    Adolfo Ibáñez, Chile), Eric Melse (CAREM, Netherlands), Tomás Rau (Universidad Católica, Chile), Research Division (Superintendencia de
    Pensiones, Chile), attendees of the Stata Conference 2013 (New Orleans), Philippe Ruh (University of Zurich), Michael Lacy (Colorado
    State).






15. Also see



    Manual: [GSM] Advanced Stata usage (Mac), [GSU] Advanced Stata usage (Unix), [GSW] Advanced Stata usage (Windows)



                
    Online: Running Stata in batch mode on Mac, Unix and Windows



    The project's wiki page contains other examples.






16. FAQs



    Here follows a list of Frequently Asked Questions:






     1.  I am getting error (608) file is read-only; cannot be modified or erased. What can I do to solve it?



         As Stata suggests, you are either trying to run parallel in a read-only directory, or your program/do-file is trying to write (save
         a dta file, for example) in a read-only directory. Try running parallel (or making your program write files) in a directory where
         you have write privileges (where you can save files).
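


         For instance, assuming a hypothetical directory C:\myproject in which you can save files, you could check the current working
         directory and change it before calling parallel (mymodel.do is the do-file from the examples and is assumed to be in that
         directory):



             . pwd
             . cd "C:\myproject"
             . parallel do mymodel.do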






     2.  How can I create reproducible results between sequential and parallel execution when randomness is involved?



         See our utility command seeding.