We analyze the primary factors influencing the performance of a parallel Cray T3E implementation of a Carbon Molecular Dynamics code developed at the Department of Physics and Astronomy at Michigan State University. We show that classical load-sharing techniques, combined with a careful analysis of Amdahl's law, can significantly increase the performance of the code. This report quantifies these factors and describes the solutions used to diminish or eliminate their effects. By slightly modifying the code, we reduced its sequential portion to less than 0.1%. We also demonstrate that the MPI collective communication implementation on the Cray T3E dramatically reduces the communication overhead for our code. In the end, a speedup of 170 was obtained using 256 Cray T3E processing elements. These results open the prospect of simulating the dynamics of 1,000-atom nanotubes in the microsecond regime (≈1,000,000 time steps).
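
To make the role of Amdahl's law concrete, the following minimal sketch evaluates the ideal speedup bound S(p) = 1 / (s + (1 - s)/p) for a serial fraction s on p processing elements. The function names are hypothetical. The value s = 0.001 is an assumption: it is only the upper limit of the "less than 0.1%" sequential portion quoted above, so the true bound may be somewhat higher. The bound ignores communication and load imbalance entirely.

```python
def amdahl_speedup(serial_fraction: float, num_pes: int) -> float:
    """Ideal Amdahl speedup for a given serial fraction on num_pes processors."""
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if num_pes < 1:
        raise ValueError("num_pes must be at least 1")
    parallel_fraction = 1.0 - serial_fraction
    return 1.0 / (serial_fraction + parallel_fraction / num_pes)


def parallel_efficiency(speedup: float, num_pes: int) -> float:
    """Fraction of the ideal linear speedup that was actually achieved."""
    return speedup / num_pes


if __name__ == "__main__":
    pes = 256            # processing elements quoted in the text
    measured = 170.0     # speedup quoted in the text
    assumed_serial = 0.001  # ASSUMPTION: upper limit of "less than 0.1%"

    bound = amdahl_speedup(assumed_serial, pes)
    print(f"Amdahl bound at s={assumed_serial}: {bound:.1f}")
    print(f"Measured efficiency: {parallel_efficiency(measured, pes):.3f}")
```

Under this assumption the bound on 256 processing elements is roughly 204, so the reported speedup of 170 sits below the ideal limit, as expected once communication overhead and residual load imbalance are taken into account.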