Abstract
Labeo bata, a commercially important minor carp of India, has considerable aquaculture potential and regional cultural value. Despite its economic significance, genomic resources for this species remain scarce, limiting progress in genetic improvement and conservation programs. To address this gap, short-read whole-genome sequencing was carried out, generating 18.3 GB of high-quality data (mean quality score: 35.57). The assembly yielded a genome size of 952,061,765 bp distributed across 176,874 contigs. Microsatellite mining using MISA identified a total of 765,937 simple sequence repeats (SSRs), including 183,181 dinucleotide, 84,624 trinucleotide, 63,784 tetranucleotide, 16,973 pentanucleotide, and 877 hexanucleotide motifs. Gene prediction with AUGUSTUS identified 45,822 putative genes, of which 31,681 were annotated using the NR, Swiss-Prot, and UniProt databases; in addition, KAAS analysis assigned 28,357 functional KO annotations from the KEGG database. Genome-wide genic SSRs were mined in genic and intronic regions of genes associated with different pathways, i.e., Metabolism, Genetic Information Processing, Environmental Information Processing, Cellular Processes, and Organismal Systems. This study provides the first comprehensive genome-wide genic SSR loci resource for L. bata, opening avenues for the development of molecular markers, population genetic studies, and genomics-assisted breeding strategies. The outcomes of this study hold promise for enhancing aquaculture performance and ensuring the sustainable utilization of this valuable indigenous carp.
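To illustrate how SSRs can be classified by motif length (dinucleotide through hexanucleotide), the following is a minimal Python sketch of perfect-repeat detection. It is not MISA and not the pipeline used in this study; the function names and the minimum repeat thresholds are hypothetical and chosen only for illustration, since the abstract does not state the parameters that were used.

```python
import re
from collections import Counter

# Hypothetical minimum repeat counts per motif length (illustrative only;
# the thresholds used in the study are not given in the abstract).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for k, n in sorted(min_repeats.items()):
        # A k-base motif followed by at least n-1 further copies of itself.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so each repeat is reported under its shortest motif length only.
            if any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // k))
    return hits


def count_by_motif_length(hits):
    """Tally SSRs by motif length (2 = dinucleotide, 3 = trinucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


if __name__ == "__main__":
    # Toy sequence: one (AT)7 repeat and one (AGC)5 repeat.
    toy = "GG" + "AT" * 7 + "CCGT" + "AGC" * 5 + "TT"
    ssrs = find_ssrs(toy)
    print(ssrs)
    print(count_by_motif_length(ssrs))
```

In practice, each contig of the assembly would be scanned in this way and the per-class tallies summed; a dedicated tool such as MISA also handles compound and mononucleotide repeats, which this sketch omits.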