Databases, Digital Libraries and Portals
History and Cultural Studies
Galician Surname Maps (Cartografía dos Apelidos de Galicia, GSM) provides a research tool for studying the geographical distribution of surnames in Galicia. It employs a geographical information system that combines statistical data with spatial analysis. GSM reports the frequency and distribution of surnames in Galicia based on the 2001 local census data made available by the Instituto Nacional de Estadística (Madrid) to the Real Academia Galega. The site gives the number of instances of a surname registered in each council district and its percentage for that district, relative to the total number of surnames registered there. This numerical information is displayed on a thematic map of council districts, with the frequency of a given surname indicated by a colour scale.
Gallaeciae Monumenta Historica (GMH) is a project of the Council of Galician Culture whose main purpose is to publish on the Web the rich medieval documentary corpus of Galicia and the academic research related to it.
We began with a preliminary phase, digitizing the twenty books of our Fontes Documentais para a Historia de Galicia collection. Now, in collaboration with other institutions that work directly on the transcription, study and documentation of this corpus, we have chosen a very classical name to present the resource to the public. The name evokes the positivist tradition, common in many European countries (Germany or Portugal, for example), of giving such titles to large compilations of sources from the past.
By taking advantage of the possibilities of new technologies, our work moves beyond the original edition and transcription towards a new, comprehensive view of the documents. We tagged each one extensively, trying to avoid the ambiguity of medieval documentation and exploring concepts, personalities and places regardless of the original book in which they were published. You can search across the whole database for people, places, monasteries or hundreds of concepts.
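The per-district percentage and colour scale described for GSM can be illustrated with a minimal sketch. It is not GSM's actual implementation: the district names, counts, colour thresholds and the names `DistrictCount`, `surname_share` and `colour_for` are all hypothetical, chosen only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass
class DistrictCount:
    district: str        # council district name (hypothetical sample data)
    surname_count: int   # instances of the surname registered in the district
    total_count: int     # all surname instances registered in the district


def surname_share(row: DistrictCount) -> float:
    """Percentage of the district's registered surnames that are this surname."""
    if row.total_count == 0:
        return 0.0
    return 100.0 * row.surname_count / row.total_count


# Hypothetical colour scale: (upper bound in percent, colour class).
COLOUR_SCALE = [
    (0.5, "pale"),
    (1.0, "light"),
    (2.0, "medium"),
    (float("inf"), "dark"),
]


def colour_for(share: float) -> str:
    """Return the first colour class whose upper bound covers the share."""
    for upper, colour in COLOUR_SCALE:
        if share <= upper:
            return colour
    return COLOUR_SCALE[-1][1]


if __name__ == "__main__":
    rows = [
        DistrictCount("District A", 120, 10000),
        DistrictCount("District B", 15, 6000),
    ]
    for row in rows:
        share = surname_share(row)
        print(f"{row.district}: {share:.2f}% -> {colour_for(share)}")
```

A real thematic map would attach each colour class to the district's polygon in the geographical information system; the sketch stops at the classification step.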
Language.
BILEGA is an analytical bibliography that provides exhaustive, up-to-date information on research, popularizing and opinion works dealing with the Galician language at any stage of its historical development.
The CLUVI Corpus of the University of Vigo is an open collection of parallel text corpora developed under the direction of Xavier Gómez Guinovart (2003-2012) that covers specific areas of the contemporary Galician language. With 23 million words, the CLUVI Corpus comprises six main parallel corpora belonging to five specialised registers or domains (fiction, computing, popular science, law and administration) and involving five different language combinations (Galician-Spanish bilingual translation, English-Galician bilingual translation, French-Galician bilingual translation, English-Galician-French-Spanish tetralingual translation and Spanish-Galician-Catalan-Basque tetralingual translation). For the sake of copyright protection, this public distribution of the CLUVI Corpus is limited to the 6 million Corpus
Corpus Documentale Latinum Gallaeciae is a text corpus, updated annually, of medieval Latin-language documentation related to Galicia dating from the 6th to the 15th century inclusive, drawn from the available editions. At present (May 2022) it holds 18,873 documents covering the content of 236 editions of documentary sources. Each document, duly scanned and cleaned of all non-Latin editorial information, has an individual header made up of 46 fields of archival, chronological and editorial information. It covers the texts collected in the cartularies and collections of the monastic centres of Samos, Sobrado, Oseira, Celanova, Xubia, Lourenzá, Carboeiro, Caaveiro, Antealtares, San Martiño Pinario, Melón, Montederramo, Ferreira de Pantón, Pombeiro, San Clodio do Ribeiro, Ferreira de Pallares, Fiães, etc., as well as documentation from the cathedral sees of Santiago, Mondoñedo, Lugo, Tui and Ourense and from the Military Orders.
A documentary corpus including different text types representative of present-day Galician, encoded in XML (eXtensible Markup Language) and covering the period from 1975 to the present day. Encoding is applied mainly to bibliographic information and document structure, and enables queries with or without regular expressions, by complete word, truncated word, or several words or word-parts, whether consecutive or not. In addition, users can restrict searches by criteria such as time span, subject field, type of document, or document area, which can be combined as needed.
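To make the difference between the query types concrete, here is a minimal sketch of complete-word, truncated-word and regular-expression matching over a plain string. It is not the corpus's own search engine or query syntax: the function `find_matches`, its `mode` names and the sample sentence are assumptions made for illustration.

```python
import re


def find_matches(text: str, query: str, mode: str = "word") -> list[str]:
    """Return every match of `query` in `text` under the given search mode.

    mode="word"      -> the query must match a complete word
    mode="truncated" -> the query is a word beginning (any ending allowed)
    mode="regex"     -> the query is used as a regular expression as-is
    """
    if mode == "word":
        pattern = r"\b" + re.escape(query) + r"\b"
    elif mode == "truncated":
        pattern = r"\b" + re.escape(query) + r"\w*"
    elif mode == "regex":
        pattern = query
    else:
        raise ValueError(f"unknown mode: {mode!r}")
    # finditer + group(0) returns whole matches even if the pattern has groups.
    return [m.group(0) for m in re.finditer(pattern, text, flags=re.IGNORECASE)]


if __name__ == "__main__":
    sample = "A lingua galega e as linguas veciñas"  # hypothetical sample text
    print(find_matches(sample, "lingua", mode="word"))
    print(find_matches(sample, "lingua", mode="truncated"))
    print(find_matches(sample, r"\bgaleg\w+", mode="regex"))
```

Restricting by time span, subject field or document type would be a separate filtering step on document metadata before matching; it is left out here.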
POS-Tagged and Semantically Annotated Galician Technical Corpus
Gondomar is the result of a research project that aims to gather in a digital repository all written production in the Galician language from the 16th, 17th and 18th centuries and, at the same time, to develop and make available to researchers and the interested public a set of tools for its interpretation and study.
It records the standard form of more than 1,500 names that form part of our anthroponymic tradition. It also includes their equivalents in other languages and seeks to satisfy curiosity about the history or legend of the figures who first bore these names or who popularized them.
RILG Portal (Integrated Language Resources for Galician) started in 2006 with the goal of promoting the integration, joint exploitation and dissemination of the textual and lexical language-technology resources for Galician generated in different projects carried out by the Instituto da Lingua Galega (ILG) of the University of Santiago de Compostela, and by the Seminario de Lingüística Informática (SLI) and the TALG Group (Galician Language Technologies and Applications) of the University of Vigo. Since 2018, the RILG Portal has included the bUSCatermos multilingual terminology databank thanks to the collaboration of the Servizo de Normalización Lingüística of the University of Santiago de Compostela. More information about RILG dated 2009.
This is an online database that publishes the approved Galician designations for concepts from the various specialized languages, with equivalents in the main foreign languages and in nearby Romance languages.
TILG is a lemmatized and tagged corpus of Modern Galician developed by the Instituto da Lingua Galega under the leadership of Antón Santamarina. In its current version (August 2018), TILG covers 3,086 works by 704 authors written between 1612 and 2013, making it possible to query an integrated database of over 30 million words belonging to 95,409 distinct headwords. There are two basic kinds of search, by word form or by headword, plus a variety of refinements through advanced search queries.
Statistical information | Works included
Info in Portuguese | Open source, but registration is required. Directors: Atónio de Carlos Moura Barros, antonmoura@imaxin.com. Angel López López, angel@imaxin.com. José Ramom Pichel Campos, jramompichel@imaxin.com.
The “Medieval Galician Computational Treasure” is a research project developed at the ILG (Institute of Galician Language), coordinated by Xavier Varela in agreement with the DXPL > SXPL (Linguistic Policy General Secretariat) of the Galician Government, and is accessible through the TMILG corpus (http://ilg.usc.es/tmilg). In total, more than 12,500 documents dating from the 13th century to the beginning of the 16th century have been collected. imaxin|software was in charge of developing both the corpus indexing and search engines. These allow the user to carry out customized searches within the Medieval Galician documents based on dates, genres, text typology, variants of the same word, agreement, regular expressions... There is no equivalent to this in any Romance language. The corpus includes varied types of works: sacred or profane lyric poetry, technical prose, literary prose, historical prose, sacred prose and legal prose. One outstanding genre is notarial prose, including substantial sacred and civil collections, with monastic prose being the most prominent.
Literature.
General | Centro Ramón Piñeiro | A dictionary that organizes a field of knowledge with as long a tradition as the one that has developed around literature in a broad sense. This project, now enhanced as a database, is by definition dynamic in character, so new entries will be incorporated and the wording of existing ones will be updated when necessary.
The Obras de Martín Sarmiento project, launched in 2002 within the Consello da Cultura Galega and directed by Professor Henrique Monteagudo with the participation of an interdisciplinary team of researchers, aims to publish the unpublished work of Friar Martín Sarmiento through a process of locating, recovering, transcribing and studying it, while also compiling an exhaustive register of manuscript holdings.
A database of specialized bibliography on medieval Galician-Portuguese literature. It compiles references to scholarly work devoted to the Galician-Portuguese secular lyric, the Cantigas de Santa Maria and medieval literary prose, the objects of study of the Arquivo Galicia Medieval (ARGAMED).
It consists of a unified catalogue of all texts originally composed in Galician-Portuguese, Portuguese and Galician, or translated into these languages, during the medieval period.
A complete edition of the lyrics and music of the 13th century Cantigas de Santa Maria of Alfonso X El Sabio, specially prepared for singers and instrumentalists.
Medieval lexicographic corpus of the Galician language.
This resource is the first lexical repertory in dictionary form, contextualized and exhaustive, of the corpus of the Galician-Portuguese secular lyric: cantigas de amor, cantigas de amigo and cantigas de escarnho e de maldizer, as well as some texts from other, less well represented genres.
This database makes available to specialists and the interested public the complete corpus of the cantigas of the Galician-Portuguese troubadours. MedDB now offers a new version which, among other new features, includes transcriptions of the Colocci notes (Notas coloccianas) and of the explanatory rubrics present in the manuscript witnesses. In addition, the range of search fields has been extended, and both the critical editions of a large part of the corpus and the information on cantigas and troubadours have been updated in line with ongoing advances in scholarship.
PalMed is a database offering the palaeographic transcription of all manuscript witnesses of the Galician-Portuguese lyric. Its aim is to facilitate the work of researchers interested in palaeographic, graphemic, linguistic and ecdotic studies of the troubadour production of the western Iberian Peninsula.
This database makes available to researchers and the general public all the medieval cantigas found in the Galician-Portuguese songbooks (cancioneiros), the corresponding manuscript images, and also the music (both the medieval music and contemporary versions or original compositions that take the texts of the medieval cantigas as their starting point). The database also includes brief information on all the authors it covers and on the characters and places mentioned in the cantigas, as well as the “Arte de Trovar”, the short treatise on troubadour poetics that opens the Cancioneiro da Biblioteca Nacional.
This relational database contains interlinked information on the texts, manuscripts and miniatures of the Cantigas de Santa Maria.
The Universo Cantigas project aims to produce a digital critical edition of the texts of the Galician-Portuguese secular lyric.
Film and Music
Includes sections about Professionals (actors, directors, producers, etc.), Films, etc.