With the onset of the COVID-19 pandemic, medical research has focused on tracing the origin of SARS-CoV-2 infection, which was first reported in Wuhan, China, at the end of 2019.

SARS-CoV-2 Clades

Image Credit: https://www.biorxiv.org/content/10.1101/2020.10.02.323519v1.full

By June 2020, all African countries had reported cases of COVID-19. A recent study posted to the preprint server bioRxiv* in October 2020 follows the spread of the virus from China to Europe and then to West Africa.

The sequence of the viral RNA has changed considerably during its spread, and these changes have led to the emergence of SARS-CoV-2 clades, which are found to be specific to different regions.

Earlier studies suggested that the D614G mutation, which characterizes clade G, may be linked to the higher mortality in US East Coast states relative to the West Coast. The increased virulence is thought to stem from stronger adherence of the virus to the cell membrane, possibly because of a hydrogen bond between the S1 and S2 units of the spike protein. Some evidence of this has come from mouse studies.

Other researchers have shown that clade G rapidly becomes dominant in any region it enters, replacing the other clades, which carry the D614 amino acid. In the middle of March 2020, this clade was confined mostly to Europe, but it was quickly introduced to other countries.

Of four Chinese G clade samples, one had the A23403G substitution which induces the D614G mutation, but the other two commonly associated mutations were lacking. Three with the D614G mutation were traced back to the German sample first sequenced in January 2020. This sample has the A23403G mutation and a C-to-T mutation at position 3037 but the one at position 14408 is missing.
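The link between the nucleotide substitution A23403G and the amino acid change D614G can be worked through in a few lines. The following is a minimal sketch, not code from the study. It assumes coordinates follow the Wuhan-Hu-1 reference genome (NC_045512.2), in which the spike gene starts at nucleotide 21563 and codon 614 reads GAT; the function names are hypothetical.

```python
# Assumption: 1-based coordinates on the Wuhan-Hu-1 reference (NC_045512.2),
# where the spike gene begins at nucleotide 21563.
SPIKE_START = 21563

# Only the two codons needed for this illustration.
CODON_TABLE = {"GAT": "D", "GGT": "G"}


def codon_of(genome_pos, gene_start=SPIKE_START):
    """Return (1-based codon number, 0-based index within that codon)."""
    offset = genome_pos - gene_start
    return offset // 3 + 1, offset % 3


def apply_substitution(codon, index, new_base):
    """Return the codon with the base at `index` replaced by `new_base`."""
    return codon[:index] + new_base + codon[index + 1:]


codon_number, index = codon_of(23403)
reference_codon = "GAT"  # assumed reference codon at spike position 614
mutant_codon = apply_substitution(reference_codon, index, "G")
label = f"{CODON_TABLE[reference_codon]}{codon_number}{CODON_TABLE[mutant_codon]}"
print(label)
```

Position 23403 falls on the middle base of codon 614, so replacing A with G turns GAT (aspartate, D) into GGT (glycine, G).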

The earliest sample to have all three mutations, as well as another one in the untranslated region (UTR), was found in Italy. In addition, the C14408T mutation found in the G clades, next to the sequence encoding the RNA-dependent RNA polymerase (RdRp), is thought to increase the rate of mutation.
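As an illustration of how samples can be screened for the three marker substitutions named above, here is a minimal sketch. It is not the preprint's pipeline: the data layout (a dictionary mapping genome position to observed base) and the function name are assumptions, and the UTR mutation is left out because its position is not given here.

```python
# Marker substitutions named in the text: position -> (reference base, variant base).
G_CLADE_MARKERS = {
    3037: ("C", "T"),
    14408: ("C", "T"),
    23403: ("A", "G"),
}


def markers_present(variants):
    """Return the marker substitutions found in a sample, ordered by position.

    `variants` is a hypothetical representation: {genome position: observed base}
    for positions where the sample differs from the reference.
    """
    found = []
    for position, (ref, alt) in sorted(G_CLADE_MARKERS.items()):
        if variants.get(position) == alt:
            found.append(f"{ref}{position}{alt}")
    return found


# The German sample as described in the text: A23403G and C3037T, but not C14408T.
german_sample = {3037: "T", 23403: "G"}
print(markers_present(german_sample))  # ['C3037T', 'A23403G']
```

A sample carrying all three substitutions, like the Italian one described above, would return all three labels.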

A phylogenetic tree constructed for the viruses circulating in France showed that the first virus introduction did not cause community spread, and clade G was introduced long before the first case was recorded. This belonged to the same clade but occurred in a patient without a history of travel or contact with a traveler.

One advantage of the African population with respect to this virus is the very large proportion of young people, with a median age of ~20 years in Sub-Saharan Africa, who are expected to have mild disease. Secondly, the risk of transmission from China via air traffic was found to be low during the early outbreak, except for South Africa and Ethiopia. In the later phase, however, researchers have found that containment measures are essential given the high infection rates that are estimated.

The current study explores the region of Africa from where the first SARS-CoV-2 sequences were isolated, to construct a phylogenetic tree. This could help understand how the virus was transmitted between countries and whether different clades have varying severities.

The sequences being studied come from the West African countries of Gambia, Ghana, Nigeria, and Senegal. These are very similar to those from China as well as from Europe.

The phylogenetic tree divides into two at the level of the D614G mutation, which indicates the importance of this single-nucleotide point mutation in viral diversity. The lower branch lacks this mutation, is linked to the earliest Wuhan sequences, and also contains the Nigerian viral sequences.

The top branch has the sequences circulating in Europe, shared by those from Senegal. Samples from Ghana cluster equally with both branches. The three Gambian samples belong to three different clades (V, GR, and GH).

Clade-Mutation Association

The researchers found that the West African samples fall into all the clades, possibly because the virus was introduced from China and European countries equally. However, the samples from each country show a characteristic cluster distribution, as described above.

The S clade has two important mutations, C24370T and G22468T, which appear to be specific to West Africa. The Ghana samples cluster in the branch emanating from the former mutation, while the latter may give rise to a branch that is the result of the migration of the virus, as shown by the presence of Mali and Tunisia strains in this branch.

Senegal strains related to the French sequences are found to arise from both these branches, while others are closer to those from Spain at the end of February, belonging to early clade S. This indicates that the virus was introduced multiple times into Senegal from France, Spain and other regions of Africa.

Timeline of Clade Distribution

When the clades are mapped against time, it is clear that the European sequences of the G clade, which carry the D614G mutation, play a major role in the African outbreak. An unexpected pattern was seen in which the clades belonging to the later part of the pandemic in Europe, namely G, GH, and GR, were in circulation before those belonging to the early European outbreaks, namely L, S, and V.

One explanation is that the French strains introduced into Senegal displayed the founder effect, being closely related to the Senegalese strains and with a similar clade distribution. Another is that migration and travel routes affected the circulation of the various strains, as seen in the first case reported in Nigeria, which originated in Italy. The earlier clades may have come by a slower route, for instance, by ship, while the later clades traveled faster and reached these countries earlier.

Again, the sample sequences from the early European clades came from the middle of March at which point the Chinese share of the epidemic was drawing to an end, indicating the possibility that the virus had passed through and within several countries before the ‘early’ samples were sequenced.

An unexpected finding was that the S clade is very abundant mainly because of the strains from Nigeria and Ghana. When these strains are excluded, the pattern of strain abundance is close to that seen worldwide, with a lag of 2-4 weeks.

Clade Distribution by Country

The early clades seem to have spread mostly in Nigeria which yielded the highest proportion of these sequences. Ghana-sourced samples come almost equally from China and Europe, while those of Senegal come mostly from France with a few early Chinese clades.

Gambia had two sequences from Europe and one from China, a pattern that resembles the Italian spread, except that the G clade is replaced by the GH clade. This change, however, associates it with both Chinese and European clades. The Gambian distribution is similar to the UK pattern, which in turn resembles the Ghanaian clade distribution because of the presence of China clades.
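The per-country comparison comes down to counting how many sequences fall into the early clades (L, S, V) versus the later European clades (G, GH, GR). The sketch below shows that bookkeeping for the three Gambian samples reported in the study (clades V, GR, and GH). The grouping labels and function name are assumptions made for illustration.

```python
from collections import Counter

EARLY_CLADES = {"L", "S", "V"}    # described in the text as early, China-based clades
LATER_CLADES = {"G", "GH", "GR"}  # described in the text as later European clades


def clade_group_shares(clades):
    """Return the fraction of samples in each clade group."""
    groups = []
    for clade in clades:
        if clade in EARLY_CLADES:
            groups.append("early")
        elif clade in LATER_CLADES:
            groups.append("later")
        else:
            groups.append("other")
    counts = Counter(groups)
    total = len(groups)
    return {group: n / total for group, n in counts.items()}


# The three Gambian samples reported in the study.
gambia = ["V", "GR", "GH"]
shares = clade_group_shares(gambia)
print(shares)
```

With only three sequences, one early clade and two later clades give shares of one third and two thirds, which shows how sensitive such proportions are to small sample sizes.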

Finally, the US West Coast and East Coast states show different distributions, with the former being similar to the Nigerian pattern, containing a large proportion of clades from China, and the latter yielding more GH clades, like the French and Senegalese strains.

Geographic Distribution

The map shows that the China-based clades L, S, and V spread through Europe and the West Coast of the US and make up a high percentage of Nigerian strains. These clades are found along with the later European clades G, GH, and GR in almost equal proportions, as on the West Coast. Senegal has the same pattern of viral distribution as France, but some exceptions belong to the early China-based clades, making for some similarities with the East Coast.

The researchers propose that this pattern arises from a combination of two explanations: the early clades may not have been active during the early period of viral circulation in West Africa, or they may have been introduced later than the early strains. The later clades may also have been more virulent, leading to their earlier detection.

The conclusion is that while Senegal and Gambia hosted multiple introductions from Europe, owing to a low level of air traffic to and from China, Ghana and Nigeria received the virus from Europe as well as from China, both directly and indirectly via Asian or European countries.

Most cases from which the strains were retrieved for sequencing came from the period when Wuhan was still locked down, between January 23rd and April 8th, 2020. Possibly, investigators say, the strains from the very early clades were transmitted very early in the pandemic, or via other countries, or from provinces in China other than Wuhan.

As in other regions, the researchers predict that the later G clades will become predominant in Nigeria and Ghana, but it is not clear whether this will also lead to higher mortality. Some researchers feel that the D614G mutation is linked to a higher case fatality rate, but others say the evidence points to higher infectivity and a higher viral load but not increased disease severity.

As of now, the case fatality rate in these countries is low, at 0.6% in Ghana but 3.2% in Gambia. The researchers hint that sunlight, vitamin D levels, and the climate may also play a role in the outcome following infection with the virus. Again, some researchers predict that the continuous occurrence of mutations will eventually cause the virus to become endemic with a low mortality rate.
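For readers unfamiliar with the measure, the case fatality rate is the share of confirmed cases that end in death, usually given as a percentage. The sketch below assumes the figures quoted above are percentages. The counts in the example are hypothetical and are not the actual case or death numbers for any country.

```python
def case_fatality_rate(deaths, confirmed_cases):
    """Return the case fatality rate as a percentage of confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


# Hypothetical counts, chosen only to illustrate the arithmetic.
print(round(case_fatality_rate(6, 1000), 1))   # 0.6
print(round(case_fatality_rate(32, 1000), 1))  # 3.2
```

Because the denominator counts only confirmed cases, the rate depends heavily on how widely a country tests.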

The researchers hope this pioneering study will be followed by more wide-ranging analyses using a larger number of sequences. However, the G clades in West Africa do not appear to correlate with mortality, thus “disproving fears that the pandemic would massively overwhelm the health systems in Africa.”

It is not yet time to relax, despite this, and measures to prevent future outbreaks must balance both the financial stability of the region and the necessity to protect the health of the inhabitants.

*Important Notice

bioRxiv publishes preliminary scientific reports that are not peer-reviewed and, therefore, should not be regarded as conclusive, used to guide clinical practice/health-related behavior, or treated as established information.

Journal reference:

  • Wruck, W., and Adjaye, J. (2020). Transmission of SARS-COV-2 from China to Europe and West Africa: A Detailed Phylogenetic Analysis. bioRxiv preprint. doi: https://doi.org/10.1101/2020.10.02.323519. https://www.biorxiv.org/content/10.1101/2020.10.02.323519v1