Bibliometric analysis and visualization of translation assessment: Research theme, evolution and hotspots

Translation assessment, which refers to evaluating various aspects of translation, is crucial to improving translation competence and quality. Taking valid papers on translation assessment published in the WoS Core Collection from 2000 to 2022 as its sample, this study visualizes and reviews the research themes, evolution and emerging hot topics of translation assessment through bibliometric analysis. The annual publication counts show that, despite overall growth, the number of articles on translation assessment fluctuates from year to year. CiteSpace-based analysis of keyword co-occurrence, keyword clustering, time zones and burst detection shows that translation assessment mainly covers five themes: translation competence, translation quality, machine translation, translation teaching and training, and others. The field has gradually shifted from the concepts, metrics and evaluation of topics related to translation assessment toward in-depth, interdisciplinary study. Recent hotspots include translation competence acquisition, neural machine translation and translation quality estimation. The findings reveal the evolution and hot spots of the research, suggest implications for further translation assessment research and provide references for scholars interested in this topic.


Introduction
Translation is the translator's delivery of meaning from one language into another, and it is an indispensable carrier of cross-cultural communication (Nida, 2003). With the development of globalization, translation has become increasingly prominent in its communicative role, and the demand for translators and interpreters is becoming more and more urgent (Bai, 2018). Translation assessment (TA) usually refers to the use of various materials and information related to translation to assess translation ability and level. Figuratively, it is like sitting together with learners and collecting information about their learning process (Kiraly, 2000). In addition to collecting qualitative and quantitative data, assessment includes interpreting the data, synthesizing them and finally comparing and judging them against the teaching objectives (Xiao, 2012). In short, assessment is the process that aids decision-making by gathering, synthesizing, and interpreting information (Airasian, 1997).
Translation assessment is a fruitful research field in Translation Studies (Han et al., 2022). It is important because well-structured assessment with reasonable content is crucial to enhancing translation competence and quality (Nikolaeva and Korol, 2021; Akulina and Tikhonova, 2021). Assessment can also be viewed as a pedagogical activity. It is instructive for teaching because it gathers information about learners to detect their individual strengths and weaknesses and then guides them in the expected direction (Dickson et al., 2020). Brunette (2000) also notes that translation scholars interpret the term translation assessment inconsistently, depending on their theoretical positions. There are accordingly a number of studies on translation assessment, including mixed-methods studies (Nikolaeva and Korol, 2021; Korol, 2020; Korol, 2021), quantitative studies (Akulina and Tikhonova, 2021; Su, 2022; Han et al., 2022; Angelone, 2020) and qualitative studies (Han, 2020; Huang and Xin, 2020). Despite these many studies, however, systematic reviews of translation assessment are lacking, let alone reviews that use bibliometric analysis and visualization to examine research hotspots and trends. A systematic review makes it possible to know the structure of a field and to regard the field in its entirety (Tymoczko, 2005).
Bibliometric analysis (BA) is defined as a method of evaluating and monitoring the progress of a given area with mathematical and statistical techniques (Yilmaz, 2019). It offers quantitative analysis of written publications (Ellegaard and Wallin, 2015). Reviewing studies over a period can not only reveal the structural features and patterns of their distribution but also monitor trends in relevant areas and assess these trends as well as future research. Bibliometrics has proved to be a significant tool for reviewing the development in TA (Gile, 2000). Using bibliometric approaches, we can describe visually the latest progress, frontier topics and existing gaps in a research field (Guo et al., 2019). CiteSpace is a software tool for BA developed by Professor Chen Chaomei that is extensively used to visualize trends and patterns in scientific literature. As an all-in-one tool, it integrates functions such as drawing visual co-citation maps, locating inflection points, searching for key nodes and analyzing regional evolution (Chen, 2006).
This paper analyses publications related to TA in the WoS Core Collection from 2000 to 2022 using bibliometric techniques and CiteSpace, aiming to examine four questions: 1) What is the overall trend of TA research from 2000 to 2022? 2) What research themes have emerged on TA from 2000 to 2022? 3) How did research on TA evolve? 4) What are the hotspots of TA research?
Answering these four questions allows the topics, citations and keywords of the published literature on translation assessment to be analyzed visually, the research themes and evolution to be explored systematically, and current emerging hot spots to be revealed. This in turn helps propose implications for further research and provides references for scholars interested in this topic.

Data source and methodology
When analyzing information for a study, the top priority lies in the richness and neutrality of the data. The dataset for this article comes from the Web of Science (WoS), which is considered the most important and most commonly used scientific database in various areas (Pranckute, 2021; Janmaijaya et al., 2018). With its strict screening mechanism, WoS is a valuable database for acquiring academic information from around the world. It contains only influential academic papers from different subjects, in line with Garfield's law of concentration in bibliometrics (Pan et al., 2019). Although there are many other databases, such as Google Scholar, this study focuses entirely on WoS.
Translation studies is not a discipline characterized by compact terminology (Marco, 2007), and the related publications are not always topically clear-cut (Huang and Xin, 2020). Scholars interpret the term "translation assessment" inconsistently, depending on their theoretical standpoints (Brunette, 2014). "Translation assessment" has therefore been a fuzzy concept, used in ways that were not always consistent. The act of translation assessment is sometimes referred to by synonymous or related terms, such as "translation evaluation" (Chapelle and Brindley, 2010; Xiao, 2012; Maier, 2014; Yang, 2019), "translation criticism" (Palumbo, 2009; Munday, 2016), "translation analysis" (McAlester, 1999) and "translation revision" (Arango-Keeth and Koby, 2003; Karoubi, 2016). However, some scholars point out that these terms differ. For example, the term evaluation refers to translation competence assessment, namely the process of evaluating translations for teaching purposes, while the term assessment applies to translation quality assessment, which validates the suitability of the translation as a product to be presented to the customer (Arango-Keeth and Koby, 2003).
After consulting the related literature and several translation experts, the retrieval strategy was set as TS = ("translation assessment" OR "translation evaluation" OR "translation criticism" OR "translation near/1 assessment" OR "translation near/1 evaluation*"), which takes both the concentration and the coverage of the literature into consideration. These terms represent the topic and cover the studies related to translation assessment, which helps guarantee retrieval accuracy. The literature sources were not narrowed, so as to include related interdisciplinary literature. The data were retrieved from the Web of Science (WoS) Core Collection on March 28, 2023. The period runs from 2000 to 2022, because related studies before 2000 were limited and rare. The total number of publications retrieved is 1460.
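The retrieval strategy above can be approximated offline, for example when re-screening exported titles, abstracts and keywords. The following is a minimal sketch, not the database's own matching logic: it assumes that NEAR/1 means the two terms occur in either order with at most one intervening word and that a trailing asterisk matches any word ending. The function names `near_pattern` and `matches_topic` are hypothetical.

```python
import re

# Exact phrases taken from the retrieval strategy in the text.
PHRASES = ["translation assessment", "translation evaluation", "translation criticism"]

# Assumed reading of NEAR/1: either order, at most one intervening word.
# The trailing "*" in "evaluation*" is approximated by \w*.
NEAR_PAIRS = [("translation", "assessment"), ("translation", r"evaluation\w*")]


def near_pattern(a, b, max_gap=1):
    """Build a regex matching a and b within max_gap intervening words."""
    gap = r"(?:\W+\w+){0,%d}\W+" % max_gap
    return re.compile(
        r"\b(?:%s%s%s|%s%s%s)\b" % (a, gap, b, b, gap, a), re.IGNORECASE
    )


NEAR_PATTERNS = [near_pattern(a, b) for a, b in NEAR_PAIRS]


def matches_topic(text):
    """Return True if the text satisfies the approximated topic query."""
    lowered = text.lower()
    if any(phrase in lowered for phrase in PHRASES):
        return True
    return any(pattern.search(text) for pattern in NEAR_PATTERNS)
```

Under these assumptions, a title such as "A model for translation quality assessment" matches through the proximity clause, whereas a text in which the two terms are several words apart does not.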
The 1460 records were reviewed to select the representative and accurate ones and to remove irrelevant and duplicate ones. Ultimately, 990 valid records were retained as the data sample and saved in plain text format for bibliometric analysis. Figure 1 shows the research design of this paper.
In Stage 1, 990 publications related to translation assessment were identified. The document types include articles, reviews and proceedings papers, of which "article" and "proceedings paper" are the main types.
The records exported from WoS contain rich information, for instance authors, title, abstract, publication year and references. Based on the information from Stage 1, the bibliometric analysis and information visualization can be carried out effectively in Stage 2. CiteSpace, the internationally widely used bibliometric analysis software developed by Chaomei Chen, was chosen for this research. It is a comprehensive research tool that makes maximum use of the information contained in the literature for a systematic, timeline-based analysis of previous research, and that uses titles, abstracts and keywords to predict future developments (Che et al., 2022). In this study, keyword co-occurrence analysis and burst detection of keywords and cited literature were carried out with CiteSpace.
Stage 3 presents conclusions and future work. In sum, this paper explores translation assessment by applying BA to papers taken from the Web of Science Core Collection. The findings cover the publication trend, cluster analysis, keyword co-occurrence (time zone) and burst detection, which together demonstrate the evolution and hotspots of the field.

Descriptive analysis
Descriptive statistics can be used to indicate the overall trends of variables and to carry out data mining (Pan et al., 2019). Counting the amount of academic literature in a discipline is one way to measure how far its research has developed. By plotting and analysing the distribution of the number of publications over time, the research status of a discipline can be appraised efficiently and its vitality and development trends can be anticipated (Diao et al., 2022). In brief, the number of publications is an important indicator of the development trends of scientific research (Guo et al., 2019).
The annual publication volume from 2000 to 2022 was calculated (see Figure 2). In total, 990 papers were published during the past 22 years, an average of 45 papers per year. Before 2000, there were very few articles on translation assessment, and they appeared only intermittently. Since 2000, related articles have shown an increasing trend. In 2000, The Translator published a special issue on "Evaluation and Translation" devoted to the evaluation and quality of translation (Maier, 2000), which may have been partly responsible for the appearance of articles on translation assessment at this time. Its introduction says that conversations about the quality and value of translation have always been particularly troublesome. Nowadays, however, the rising demand for translation in areas such as business and technology requires specialized translations as well as a more scientific approach to determining value. This is the reason why the special issue was set up.
According to the line graph, translation assessment research can be divided into three stages. (1) The initial period, from 2000 to 2007, saw a total of 88 papers published, an annual average of 11 papers. The number of papers was small and growth was slow. During this period there were two pullbacks, almost all of them followed by rallies, the fastest of which was in 2005. Papers related to machine translation appeared in 2005. (2) During the stable period from 2008 to 2013, a total of 171 papers were published, with an annual average of 19 papers. Despite a certain degree of fluctuation, the overall trend was upward. In 2014, the number of papers increased significantly and reached a peak of 36. (3) During the rapid development period from 2014 to 2022, a total of 726 papers were published, with an annual average of 81 papers, and translation assessment research tended to be stable. Although there was a decline in 2015, the number grew rapidly in the following year and then remained stable, which indicates to some extent that translation assessment research has received continuous attention from a certain number of scholars and has formed its own relatively independent research field.
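The stage statistics described above come down to counting records per year and averaging within inclusive year ranges. The sketch below illustrates that arithmetic on a small hypothetical list of publication years; it does not use the paper's data, and the helper names are assumptions introduced for illustration.

```python
from collections import Counter


def annual_counts(years, start, end):
    """Count publications per year, including years with no publications."""
    counts = Counter(years)
    return {year: counts.get(year, 0) for year in range(start, end + 1)}


def stage_summary(counts, stages):
    """Return (first, last, total, annual mean) for each inclusive stage."""
    summary = []
    for first, last in stages:
        total = sum(counts[year] for year in range(first, last + 1))
        n_years = last - first + 1  # inclusive: 2000-2007 spans eight years
        summary.append((first, last, total, total / n_years))
    return summary


# Hypothetical publication years, NOT the dataset analysed in this paper.
years = [2000, 2001, 2001, 2003, 2008, 2009, 2009, 2009, 2014, 2014, 2015, 2016]
counts = annual_counts(years, 2000, 2016)
stages = [(2000, 2007), (2008, 2013), (2014, 2016)]
summary = stage_summary(counts, stages)

for first, last, total, mean in summary:
    print(f"{first}-{last}: {total} papers, {mean:.2f} per year")
```

Because the year ranges are inclusive, the divisor for each stage is the last year minus the first year plus one, which is worth checking whenever stage averages are reported.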

Research themes
Cluster analysis is an exploratory data mining technique for identifying and analyzing the classification of significant terms and their context within a research field (Si et al., 2019). A series of algorithms transforms the gathered data into several structured clusters, which gives a deeper understanding of the research topics, such as their distribution and structure (Olawumi and Chan, 2018).
Cluster analysis rests on one of the most basic, simplest and most often overlooked methods of understanding and learning: grouping "objects" into "similar" groups, so that each object is more similar to the other objects in its group than to objects outside it. Many different algorithms and methods can be used to form such clusters.
In this paper, CiteSpace 6.2R2 was used to cluster the keywords of the 990 publications, and the LLR algorithm was used to identify 12 research clusters and the corresponding keywords of each cluster. The log-likelihood ratio (LLR), which produces clusters with low inter-class similarity and high intra-class similarity, served as the clustering technique. Cluster analysis thus helps to classify large amounts of information into manageable units and then to infer the content of each cluster.
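To make the log-likelihood ratio concrete, the sketch below computes Dunning's G² statistic for the association between one term and one cluster from a 2x2 contingency table. It assumes that LLR here refers to this standard statistic; whether CiteSpace computes exactly this quantity is an assumption, and the counts in the example are hypothetical.

```python
import math


def xlogx(x):
    """Return x * ln(x), with the convention 0 * ln(0) = 0."""
    return x * math.log(x) if x > 0 else 0.0


def log_likelihood_ratio(k11, k12, k21, k22):
    """Dunning's G^2 for a 2x2 table of document counts.

    k11: documents in the cluster that contain the term
    k12: documents in the cluster that do not contain the term
    k21: documents outside the cluster that contain the term
    k22: documents outside the cluster that do not contain the term
    """
    total = k11 + k12 + k21 + k22
    rows = xlogx(k11 + k12) + xlogx(k21 + k22)
    cols = xlogx(k11 + k21) + xlogx(k12 + k22)
    cells = xlogx(k11) + xlogx(k12) + xlogx(k21) + xlogx(k22)
    # Algebraically equal to 2 * sum(observed * ln(observed / expected)).
    return 2.0 * (cells - rows - cols + xlogx(total))


# Hypothetical counts for two candidate terms of one cluster.
candidates = {
    "translation competence": (30, 5, 5, 60),  # concentrated in the cluster
    "translation": (10, 10, 10, 10),           # spread evenly, uninformative
}
ranked = sorted(candidates, key=lambda t: log_likelihood_ratio(*candidates[t]), reverse=True)
```

A term that is spread evenly inside and outside a cluster scores close to zero, while a term concentrated in the cluster scores high, which is why high-LLR terms are informative about a cluster's content.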
The past 20 years' data were obtained to analyze research themes. The time slice length was set at 2 years and the selection criterion was the g-index with k = 15. On the basis of the LLR algorithm, cluster analysis shows 277 nodes, 266 connections, and 12 clusters. The modularity was Q = 0.8585 and the weighted mean silhouette was S = 0.9863. The 12 research clusters and their keywords are shown in Table 1, in the following order: #0 translation competence, #1 translation quality, #2 statistical machine translation, #3 machine translation, #4 machine translation evaluation, #5 legal translation, #6 translation quality assessment, #7 translation evaluation, #8 neural works, #9 neural machine translation, #10 translator training, #11 translation strategies. Based on the clustering results, this paper integrates the above 12 research clusters with their related keywords and then summarizes 5 domains of translation assessment. As Table 1 shows, the twelve clusters contain some duplicate and similar content. For example, "statistical machine translation", "neural machine translation" and "machine translation evaluation" are all related to the cluster "machine translation", so the four clusters are combined into one. Similarly, the clusters "translation quality" and "translation quality assessment" both focus on quality, so they are merged into one cluster, "translation quality". The "translation evaluation" cluster, being the research topic of this study, is eliminated. Thus, on the basis of the cluster analysis results and through further study and analysis of the literature, this paper merges the above 12 clusters and keywords to derive five research themes, i.e., translation competence, translation quality, machine translation, translation teaching and training, and others, as shown in Figure 3.
The first main cluster is translation competence, which includes translator's competence evaluation, translation competence acquisition and competence levels. In this cluster, scholars mainly focused on the construct of translation competence and on the assessment of translators' translation competence.
The second main cluster is translation quality. It covers a rich range of topics that serve the assessment of translation quality, such as assessment tools and methods, rubrics and modules, processes and procedures, validity and reliability, literary translation and corpus-based translation, and the comparison between human translation and machine translation.
The third main cluster is machine translation, including keywords such as quality estimation, audiovisual translation, speech translation, automated evaluation metrics, natural language processing and attention mechanism.
The fourth main cluster is translation teaching and training. This cluster focuses on assessment in education, such as academic performance, bilingual subcompetence, English for translation and interpreting, translation revision, translation techniques, etc. In addition, how to assess and improve students' translation skills in the teaching process is frequently discussed in this cluster.
The fifth main cluster is "others", which includes legal translation, language and speech interfaces, culture-specific elements, etc.
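The modularity Q reported for the cluster solution measures how much denser the links inside clusters are than would be expected by chance; values closer to 1 indicate more clearly separated clusters. As a minimal sketch, assuming an unweighted, undirected keyword network and the standard Newman formulation (the exact variant used by the software is an assumption), Q can be computed from an edge list and a cluster assignment. The toy network below is hypothetical.

```python
from collections import defaultdict


def modularity(edges, community):
    """Newman modularity Q for an unweighted, undirected graph.

    edges: list of (u, v) pairs, each undirected edge listed once
    community: dict mapping every node to its cluster label
    """
    m = len(edges)
    internal = defaultdict(int)    # edges with both ends in the cluster
    degree_sum = defaultdict(int)  # total degree of the cluster's nodes
    for u, v in edges:
        degree_sum[community[u]] += 1
        degree_sum[community[v]] += 1
        if community[u] == community[v]:
            internal[community[u]] += 1
    # Q = sum over clusters of (internal edge share - expected share).
    return sum(
        internal[c] / m - (degree_sum[c] / (2 * m)) ** 2 for c in degree_sum
    )


# Hypothetical toy network: two triangles joined by a single bridge edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
community = {"a": 0, "b": 0, "c": 0, "d": 1, "e": 1, "f": 1}
q = modularity(edges, community)
```

For this toy network each triangle holds 3 of the 7 edges and half of the total degree, so Q = 2 × (3/7 − 1/4) = 5/14, roughly 0.36; placing all nodes in a single cluster would give Q = 0.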

Research evolutions
There is a growing interest in tracking research topics. One method is to visualize various temporal patterns and detect trends on this basis (Erten et al., 2003). Keywords are the typical words that depict the central content of a paper, and network analysis of keyword co-occurrence helps detect where research is concentrated in a given field (Diao et al., 2022). This paper visualizes the keyword co-occurrence network by time zone, as shown in Figure 4. The time span runs from 2000 to the present, showing the whole process of the evolution. The search terms for translation assessment were matched against titles, abstracts and keywords. The size of a node reflects the frequency with which a keyword occurs: the larger the node, the higher the frequency. In addition, keywords near the center of each node represent the core topics for that time period.
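The quantities behind such a time-zone view can be sketched in a few lines: each keyword's frequency (node size), the year in which it first appears (assumed here to determine its horizontal position), and how often pairs of keywords occur in the same record (the links). The records below are hypothetical, and the function name is an assumption for illustration.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_by_first_year(records):
    """Summarize keyword frequency, first year of appearance and co-occurrence.

    records: iterable of (year, list_of_keywords) tuples
    """
    frequency = Counter()   # number of records containing each keyword
    first_year = {}         # earliest year in which each keyword appears
    links = Counter()       # co-occurrence count per unordered keyword pair
    for year, keywords in sorted(records, key=lambda rec: rec[0]):
        unique = sorted({k.lower() for k in keywords})
        for keyword in unique:
            frequency[keyword] += 1
            first_year.setdefault(keyword, year)
        for a, b in combinations(unique, 2):
            links[(a, b)] += 1
    return frequency, first_year, links


# Hypothetical records, NOT taken from the dataset analysed in this paper.
records = [
    (2001, ["translation quality", "translation competence"]),
    (2006, ["machine translation", "evaluation metrics"]),
    (2006, ["machine translation", "translation quality"]),
    (2017, ["neural machine translation", "translation quality"]),
]
frequency, first_year, links = cooccurrence_by_first_year(records)
```

Pairs are stored in alphabetical order so that each unordered pair is counted under a single key.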
Since 2000, keywords have occurred intensively. From 2000 to 2005, the main topics studied by scholars were translation quality and its assessment, translation competence, translation assessment, revision and translation error. In other words, the study of translation assessment began with translation quality assessment and translation competence assessment. Some scholars developed models for translation quality assessment (al-Qinai, 2000; House, 2001; Williams, 2001; etc.) and revisions for translation quality control (Chakhachiro, 2005), while others came up with different approaches to translation evaluation (Waddington, 2001; Bowker, 2001; Li, 2001; etc.). The construct, acquisition and assessment of translation competence were also explored (Martínez and Hurtado, 2001; Orozco and Hurtado Albir, 2002; Pym, 2003; etc.).
From 2005 to 2012, scholars' research themes gradually evolved toward machine translation evaluation. Scholars concentrated on the evaluation metrics of machine translation and how to improve them (Yao et al., 2006; Zhu and Wang, 2006; Yang et al., 2008; etc.). Within the machine translation theme, statistical machine translation systems received attention (Sadat and Habash, 2006; Hassan et al., 2006; Schwenk and Estève, 2008). During this period, although research still covered translation assessment and translation evaluation, it was less prominent than machine translation (Li, 2006; Albir, 2007; Colina, 2008; etc.). From 2012 to 2015, the research topic began to center on translation assessment in teaching and training. The acquisition and evaluation of translation competence among EFL learners or translators were studied frequently (Göpferich, 2013; Pym, 2013; Károly, 2014; etc.). In addition, error patterns were identified and corpora were constructed to help assess translations (Popescu, 2013; Lee and Ronowicz, 2014).
A new research theme revolving around neural machine translation evaluation emerged from 2015 to 2018 (Guzmán et al., 2017; Moorkens, 2018). With the rise of this topic, scholars also began to pay attention to the quality of machine translation (Burchardt et al., 2016; Castilho and O'Brien, 2017). Improving translation quality through editing, revision and other methods was also studied during this period (Daems et al., 2017; De Sutter et al., 2017; Liu et al., 2017; etc.).
To sum up, the evolution of translation assessment research from 2000 to 2022 can be summarized as follows. (1) The number of participants has risen. At first, scholars studying translation assessment came mainly from the fields of translation and linguistics. Later, scholars with different research backgrounds, such as psychology, testing, pedagogy and computer science, joined the study of this topic. (2) The interdisciplinary nature of the research content has become increasingly apparent. Initially, translation assessment principally evaluated competence and quality; the former is closely related to the field of pedagogy, for instance error analysis, post-editing and translator training, while the latter is closely linked to the fields of literature and linguistics.
With the emergence of machine translation, its assessment became closely connected to the fields of science and technology, such as computing, software and various algorithms, and took on a stronger interdisciplinary character. (3) The research angle has developed: in the early stage of machine translation, scholars concentrated on evaluation metrics and their reliability. As research on machine translation deepened, scholars shifted to the quality of machine translation and how to improve it. (4) As for research methods, scholars usually used controlled experiments to compare translation quality or translation competence. Since 2012, case studies and surveys have become more popular. (5) Half of the research on machine translation before 2018 came from the collected papers of international conferences. This is probably because research on machine translation was at an early stage and was the research hotspot at that time. Since 2018, research in this field has come from journals, indicating that translation assessment has received sustained attention from a certain number of scholars.

New hot spots
To explore the research hot spots, the high-frequency keywords and their centrality were first tabulated. According to Table 2, translation competence, machine translation and translation quality are the top three keywords in both frequency and centrality. To further understand the research hotspots of translation assessment, burst detection of keywords and citations was carried out with CiteSpace.
Keyword bursts and cited-literature bursts provide evidence that specific keywords and citations are associated with spikes in occurrence frequency. A keyword burst appears when a topic has attracted or is attracting researchers' attention at a certain time, and a cited-literature burst indicates that the research community has paid or is paying particular attention to a paper's potential contribution (Diao et al., 2022). Burst detection, regarded as a sign of a highly active research field, is therefore able to discover emerging trends as well as fleeting ones (Pollack and Adler, 2015). To explore the emerging hot spots in translation assessment, this paper conducted burst detection of keywords and citations in the literature published in the past eight years and extracted the results for the past five years based on burst strength, as shown in Table 3. According to the keyword burst detection results, translation competence acquisition and neural machine translation have the highest burst strength, at 2.28 and 2.22 respectively. The former shows that assessing the acquisition of translation competence has been a research focus in recent years, while the latter shows that, within machine translation evaluation, neural machine translation has surpassed statistical machine translation and attracted increasing attention from researchers. Other keywords, such as neural mt and translation quality, burst from 2018, demonstrating researchers' concern with the evaluation of translation quality; translation competence, student translator and statistical machine translation began to burst in 2019, showing that students' translation competence was a hotspot at that time and that statistical machine translation was still being explored by scholars. However, the bursts of the above keywords lasted only one year. Since 2020, keyword bursts have lasted more than one year, as with neural machine translation, translator competence and neural network, indicating that these areas have received more sustained attention.
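The idea of a burst can be illustrated with a simplified two-state model in the spirit of Kleinberg's burst detection, on which tools of this kind are commonly based. The sketch below is an assumption-laden illustration rather than the software's implementation: the scaling factor `s`, the transition penalty `gamma` and the yearly counts are all hypothetical, and it assumes the keyword occurs at least once but not in every document.

```python
import math


def burst_states(r, d, s=2.0, gamma=1.0):
    """Label each period 0 (baseline) or 1 (burst) with a two-state model.

    r: number of documents containing the keyword in each period
    d: total number of documents in each period
    s: assumed ratio between the burst rate and the baseline rate
    gamma: assumed penalty weight for entering the burst state
    """
    n = len(r)
    p0 = sum(r) / sum(d)        # baseline rate over the whole span
    p1 = min(s * p0, 0.9999)    # elevated rate in the burst state
    rates = (p0, p1)

    def fit_cost(state, rt, dt):
        # Negative log of the binomial likelihood under the state's rate.
        p = rates[state]
        log_lik = (math.log(math.comb(dt, rt))
                   + rt * math.log(p) + (dt - rt) * math.log(1 - p))
        return -log_lik

    up_cost = gamma * math.log(n)  # moving up costs; moving down is free
    best = [0.0, math.inf]         # the sequence starts in the baseline state
    back = []
    for rt, dt in zip(r, d):
        new_best, pointers = [], []
        for j in (0, 1):
            options = [best[i] + (up_cost if j > i else 0.0) for i in (0, 1)]
            i_min = 0 if options[0] <= options[1] else 1
            new_best.append(options[i_min] + fit_cost(j, rt, dt))
            pointers.append(i_min)
        best = new_best
        back.append(pointers)

    # Trace the cheapest state sequence backwards (Viterbi).
    state = 0 if best[0] <= best[1] else 1
    path = []
    for pointers in reversed(back):
        path.append(state)
        state = pointers[state]
    path.reverse()
    return path


# Hypothetical yearly counts for one keyword over eight years.
keyword_counts = [2, 2, 2, 20, 20, 2, 2, 2]
total_counts = [100] * 8
states = burst_states(keyword_counts, total_counts)
```

In this toy series the keyword's share jumps from 2% to 20% in the fourth and fifth years, so those two years are the ones labelled as a burst; the start and end years of a burst correspond to the first and last periods in state 1.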
In terms of citation burst detection, the identified citations center on the machine translation field, signifying that TA is explored not only in linguistics and language studies but also in computer science. Classic papers on translation competence began a sustained burst in 2018, with the highest strength. Next is machine translation, which started to burst in 2017. Among the 10 highly cited articles, 7 are empirical studies in which an experiment or case study is conducted to obtain data, demonstrating that empirical study has been the mainstream in the TS field in recent years. In addition, one article on machine translation comes from conference proceedings, which indicates that the hotspot in this field is discussed at conferences. In a translation competence acquisition experiment, Beeby et al. (2015) found that, after training, professional and expert translators have a less obvious concept of and approach to translation than students, because the former prefer a static concept of translation while the latter have received more dynamic educational input. Kudo (2018) showed experimentally that using segmentation ambiguity as noise to enhance the accuracy and robustness of neural machine translation is feasible. The highly fragmented view of translation quality and the fundamentally different quality evaluation methods for human and machine translation cause problems for requesters and users. To address these problems, Lommel et al. (2014) developed the Multidimensional Quality Metrics (MQM) framework to declare and describe translation quality metrics through a shared vocabulary of "issue types". PACTE (2016) presents the results of the 2016 Machine Translation Conference, which included five machine translation tasks, three evaluation tasks, an automatic post-editing task and a bilingual document alignment task.
A comparison of the keyword and citation burst results shows that machine translation, translation competence and translation quality are the high-frequency topics common to both detections. Machine translation refers to the study of how to translate using computers, and it has long been considered one of the most challenging tasks in the field of natural language processing (NLP) (Wang et al., 2022). The rise and popularity of machine translation triggered wide-ranging exploration of translation quality and translation competence evaluation, since whether machine translation is as good as human translation is a question of considerable interest. The emerging hot topics in the translation assessment field are thus mainly related to translation quality estimation and translation competence acquisition.

Conclusions
This study, taking valid papers on translation assessment published in WoS core collection from 2000 to 2022 as research samples, visualizes and reviews the research theme, research evolutions and emerging hot topics of translation assessment through bibliometric analysis, proposes implications for further translation assessment research and provides references for scholars in this field.
This paper presents the outcome of the BA of research papers on TA in the WoS Core Collection. By combing through 990 publications related to TA, the research themes, evolution and hotspots are summarized. The research findings are as follows. (1) Research on TA has passed through three stages of development: a preliminary stage (2000-2007), a growth stage (2008-2013) and a rapid stage (2014-2022). The growth trend of this research field shows that translation assessment has received continuous attention from a certain number of scholars and has formed its own relatively independent research field. (2) Research from 2000 to 2022 can be categorized into five themes: translation competence, translation quality, machine translation, translation teaching and training, and others. (3) The evolution of TA research from 2000 to 2022 can be presented in several aspects: the rise in the number of participants, the interdisciplinary nature of the research content, the development of the research angle, and changes in research methods and publication venues. (4) Burst detection identifies the newly developing parts of this field, which include translation competence acquisition, translation quality estimation and machine translation evaluation.
Admittedly, the paper still has some limitations. On the one hand, although WoS is a representative and authoritative database, retrieving the same keywords in other databases yields quite different numbers of articles. Optimizing the database basis of translation assessment research is therefore a direction for future research. On the other hand, translation assessment is a topic involving many disciplines, such as linguistics, education and computer science. Limited by length and by the authors' capacity for interdisciplinary integration, the paper does not carry out a detailed and in-depth analysis from the perspective of interdisciplinary fusion. This will be a direction for improvement in follow-up research.

Table 2. High-frequency keywords and centrality.