Wheat stripe rust disease (WRD) is extremely detrimental to wheat crop health: it severely reduces crop yield and thereby increases the risk of food insecurity. Trained personnel currently inspect wheat fields manually to assess the spread of the disease and the extent of the damage. However, given the large area of wheat plantations, manual inspection is inefficient, time-consuming, and laborious. Artificial intelligence (AI) and deep learning (DL) offer efficient and accurate solutions to such real-world problems. By analyzing large amounts of data, AI algorithms can identify patterns that are difficult for humans to detect, enabling early disease detection and prevention. However, deep learning models are data-driven, and the scarcity of data on specific crop diseases is a major hindrance to developing such models. To overcome this limitation, we introduce an annotated real-world semantic segmentation dataset named the NUST Wheat Rust Disease (NWRD) dataset. Multileaf images from wheat fields, captured under various illumination conditions and with complex backgrounds, were collected, preprocessed, and manually annotated to construct a segmentation dataset specific to wheat stripe rust disease. Classification of WRD into different types and categories has already been addressed in the literature; however, semantic segmentation of wheat crops to identify the specific areas of plants and leaves affected by the disease remains a challenge. For this reason, we target semantic segmentation of WRD to estimate the extent of disease spread in wheat fields. Sections of fields where the disease is prevalent need to be segmented so that the diseased plants can be quarantined and remedial action taken. This in turn confines the use of harmful fungicides to the affected areas instead of the majority of the wheat field, promoting environmentally friendly and sustainable farming.
Given the complexity of the proposed NWRD segmentation dataset, our experiments yielded promising results using the UNet semantic segmentation model together with the proposed adaptive patching with feedback (APF) technique, which achieved a precision of 0.506, a recall of 0.624, and an F1 score of 0.557 for the rust class.
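
For readers unfamiliar with per-class segmentation metrics, the following is a minimal sketch of how pixel-wise precision, recall, and F1 score for a single positive class (here, rust) can be computed from a predicted mask and a ground-truth mask. It is an illustration only and not the evaluation code used in this work; the function name `rust_metrics` and the toy masks are hypothetical, and the sketch assumes binary masks in which 1 marks a rust pixel and 0 marks everything else.

```python
def rust_metrics(pred, truth):
    """Pixel-wise precision, recall, and F1 for the positive (rust) class.

    pred and truth are equally sized 2D lists of 0/1 labels
    (assumption: 1 = rust pixel, 0 = non-rust pixel).
    """
    tp = fp = fn = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p == 1 and t == 1:
                tp += 1  # rust pixel correctly predicted as rust
            elif p == 1 and t == 0:
                fp += 1  # non-rust pixel wrongly predicted as rust
            elif p == 0 and t == 1:
                fn += 1  # rust pixel missed by the prediction

    # Guard against empty denominators (no predicted or no true rust pixels).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return precision, recall, f1


# Hypothetical 3x4 toy masks, for illustration only.
truth = [[0, 1, 1, 0],
         [0, 1, 1, 0],
         [0, 0, 0, 0]]
pred = [[0, 1, 1, 1],
        [0, 0, 1, 1],
        [0, 0, 0, 0]]

precision, recall, f1 = rust_metrics(pred, truth)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this toy case, three rust pixels are predicted correctly, two background pixels are wrongly marked as rust, and one rust pixel is missed. Precision measures how much of the predicted rust area is truly diseased, while recall measures how much of the diseased area was found.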